Electron Beam Heating of a Thin Film 
on a Highly Conducting Substrate 


By J. A. MORRISON and S. P. MORGAN 
(Manuscript received January 27, 1966) 


An analysis is made of the steady-state temperature distribution in a 
poorly conducting plane film on a highly conducting semi-infinite substrate, 
owing to a time-independent heat input in a cylindrical region of the film 
and substrate. The problem is of interest in connection with the localized 
hardening of anodic oxide films on silicon by electron beam bombardment 
in order to produce oxide diffusion masks for the manufacture of integrated 
circuits. A formal solution is obtained for arbitrary dependence of the heat 
input on radius and depth, and a detailed study is made of a particular 
case in which the heat input is independent of radius across the beam, and 
varies in a realistic manner with depth in the film. Approximate formulas 
are given for the temperature in the film when the radius of the beam is 
large compared to the thickness of the film, and also when the conductivity 
of the film is small compared to the conductivity of the substrate. The ap- 
proximate formulas are compared with the results of calculations based on 
the exact solution. Finally, a crude estimate is made of the time required 
to reach the steady state. 


I. INTRODUCTION AND SUMMARY 


Recently considerable interest has developed in the application of 
electron beam technology to microelectronics. A number of papers 
have been concerned with the heat-flow problems encountered when a 
high-power electron beam interacts with a target. Previous investigations 
have considered electron heating of a uniform semi-infinite target, 
of a target consisting of a highly conductive metal film on a less conductive 
substrate, and of a thin film not supported by a substrate. 
The heating of a poorly conductive film on a highly conductive substrate 
does not appear to have been treated before, and forms the topic of the 
present investigation. It corresponds to the case of an electron beam 
incident upon an oxidized silicon substrate. 

This analysis may find an application in the fabrication of oxide diffusion 
masks for integrated circuits by electron beam bombardment. 
The etch rate of an anodic oxide film on silicon in hydrofluoric acid has 
been shown to decrease strongly under electron bombardment, and a 
proposal for producing patterns is to harden some areas of the film and 
then to remove the surrounding oxide with dilute HF. 

It should be noted that electron beam bombardment produces radia- 
tion damage as well as thermal effects. The radiation damage alone would 
increase the etch rate of the film in HF, but in conjunction with high 
temperature it also facilitates ionic rearrangement in the SiO₂ film during 
irradiation, which leads to a decrease in etch rate. The latter effect 
predominates by far in the case of anodic SiO₂ films, so that the net 
result is a strong decrease in etch rate. 

While the rise in temperature during irradiation is thus not the only 
factor contributing to the “hardening” of the oxide film, it is still the 
major factor, and a knowledge of the temperature distribution during 
irradiation is highly desirable. On the one hand one is interested in 
working at a high temperature in order to increase the rate of oxide 
“hardening”; on the other hand one must stay below the melting point 
of silicon (1415°C), or perhaps even lower in order not to generate ex- 
cessive thermal stresses in the silicon. The edge definition of the hard- 
ened region in the oxide film is also of paramount interest for mask fabri- 
cation. Effects due to radiation damage will not be considered here, 
but it may be noted that radiation damage will be generated exclusively 
in the oxide and not in the silicon at the accelerating voltages of interest 
(less than 10 kv). 

In this paper we consider the mathematical problem of calculating 
the steady-state temperature distribution due to an axially symmetric, 
time-independent heat input throughout a cylindrical volume of the 
film and substrate. The thermal properties of both materials are assumed 
independent of temperature, and radiation from the outer surface of the 
film is neglected. A formal solution of the problem is given in Section 
II for an arbitrary dependence of the heat input on radius and depth; 
but in the subsequent analysis we assume that at any given depth 
the heat input is independent of radius across the beam and zero outside 
the beam. We also confine our attention to the temperature distribution 
in the film itself. The only thing we really need to know about 
the temperature in the substrate is that its maximum, which occurs 
on the axis at the film-substrate interface, is not high enough to melt 
the substrate. 

With a fixed distribution of input heat, the normalized temperature 
distribution in the film depends on two dimensionless parameters, namely 
the ratio of beam radius to film thickness and the ratio of film conduc- 
tivity to substrate conductivity. In the physical problem, the beam 
radius may be several times the film thickness, and the conductivity of 
the oxide film is between a tenth and a hundredth of the conductivity 
of the silicon substrate. In Section III an asymptotic approximation is 
given for the temperature distribution when the normalized beam radius 
is large. Section IV contains the solution for a perfectly conducting sub- 
strate, as well as an estimate of the first-order effect of finite but large 
substrate conductivity. 

In order to calculate the temperature distribution numerically, it is 
necessary to assume a definite dependence of heat input on distance 
into the film (“depth-dose function”). In Section V we assume a depth-dose 
function which approximates the form determined empirically by 
Grün and also employed by Wells. The parameters are adjusted so 
that the power input is maximum at a depth equal to 40 per cent of the 
film thickness and zero at the bottom of the film, since, in general, 
one wishes to avoid direct heating of the substrate by the electron 
beam. Contour plots of normalized temperature have been calculated 
from the formulas of Section II for selected beam diameters and con- 
ductivity ratios. In addition, the exact temperature distributions along 
the axis and at the top and bottom of the oxide film are compared with 
the approximate formulas of Section III. 

As a typical numerical result, we find that for an SiO₂ film of thickness 
0.5 micron, bombarded by a 5 kv electron beam of diameter 20 microns 
with a current of 628 μa and a uniform power density of 10⁶ watts/cm², 
the steady-state temperature rise on the axis is about 1800°C at the 
surface of the film, and about 800°C at the surface of the silicon substrate. 

In Section VI a crude estimate is made of the time required to reach 
the steady-state temperature after the electron beam is instantaneously 
switched on. It appears that in an example such as the preceding, the 
transient time would be of the order of a few tenths of a microsecond, 


II. STEADY-STATE TEMPERATURE DISTRIBUTION 


The geometry of the problem to be considered is shown in Fig. 1. 
A plane film of thermal conductivity K₁ fills the region 0 ≤ z ≤ c, and 
overlies a semi-infinite substrate of thermal conductivity K₂ which 
fills the region z < 0. We wish to find the steady-state temperature rise 
T(r,z) under the influence of an axially symmetric, distributed heat 
source of strength Q(r,z). 

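The temperature rise satisfies Poisson’s equation, 

$$\frac{\partial^2 T}{\partial r^2} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^2 T}{\partial z^2} = \begin{cases} -Q/K_1, & 0 < z < c, \\ -Q/K_2, & z < 0, \end{cases} \tag{1}$$

and the boundary conditions are 

$$\begin{aligned} T_z(r,c) &= 0, \\ T(r,0^+) &= T(r,0^-), \\ K_1 T_z(r,0^+) &= K_2 T_z(r,0^-), \\ T(r,z) &\to 0 \quad \text{as } r^2 + z^2 \to \infty, \end{aligned} \tag{2}$$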


where T_z denotes ∂T/∂z. The first of the boundary conditions asserts 
that there is no heat flow across the upper boundary of the film. The 
method of solution which we are going to use would also allow for a 
linearized radiation condition at the surface, i.e., a linear relation between 
T(r,c) and T_z(r,c), if one knew the appropriate coefficients. The 
second and third conditions insure the continuity of temperature and 
heat flow across the interface between film and substrate, and the 
fourth condition says that the temperature rise tends to zero at great 
distances from the source. 


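[Figure: a plane film of conductivity K₁, with temperature rise T₁(r,z), lying on a semi-infinite substrate of conductivity K₂, with temperature rise T₂(r,z).] 

Fig. 1 — Cross section of plane film on semi-infinite substrate. 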

It will be convenient henceforth to work in terms of normalized, 
dimensionless quantities. In particular, we shall take the film thickness 
as the unit of length, and denote the ratio of film conductivity to substrate 
conductivity by ε. We also introduce a representative heat 
source strength Q₀ having the dimensions of power per unit volume. 
Thus, let 

    ξ = r/c = normalized radius 
    ζ = z/c = normalized depth 
    ε = K₁/K₂ = conductivity ratio 
    q = Q/Q₀ = normalized heat input 
    U = (K₁/c²Q₀)T = normalized temperature rise. 

In terms of normalized quantities, (1) takes the form 

$$\frac{\partial^2 U}{\partial \xi^2} + \frac{1}{\xi}\frac{\partial U}{\partial \xi} + \frac{\partial^2 U}{\partial \zeta^2} = \begin{cases} -q, & 0 < \zeta < 1, \\ -\varepsilon q, & \zeta < 0, \end{cases} \tag{3}$$

and the boundary conditions (2) become 

$$\begin{aligned} U_\zeta(\xi,1) &= 0, \\ U(\xi,0^+) &= U(\xi,0^-), \\ \varepsilon U_\zeta(\xi,0^+) &= U_\zeta(\xi,0^-), \\ U(\xi,\zeta) &\to 0 \quad \text{as } \xi^2 + \zeta^2 \to \infty. \end{aligned} \tag{4}$$

In what follows, we shall treat separately the cases of heat input to 
the film and heat input to the substrate. The general case follows by 
superposition. 
2.1 Heat Input to Film 


Assume that q(ξ,ζ) differs from zero only in the film. We denote the 
normalized temperature rise in the film by U₁(ξ,ζ), and seek a solution 
of Poisson’s equation in the form 

$$U_1(\xi,\zeta) = \int_0^\infty f(w,\zeta)\, J_0(w\xi)\, dw, \qquad 0 \le \zeta \le 1, \tag{5}$$

where f(w,ζ) is a function to be determined. Similarly, for the normalized 
temperature rise in the substrate, U₂(ξ,ζ), we seek a solution of Laplace’s 
equation in the form 

$$U_2(\xi,\zeta) = \int_0^\infty g(w)\, e^{w\zeta} J_0(w\xi)\, dw, \qquad \zeta \le 0, \tag{6}$$

which vanishes as ζ → −∞. 


666 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1966 


We shall consider the case in which the normalized heat input can be 
written as 


$$q(\xi,\zeta) = \psi(\xi)\,\varphi(\zeta), \tag{7}$$


that is, as the product of a function of radius times a function of depth. 
This will probably be justifiable if the increase in beam width with 
depth due to electron scattering is negligible. Furthermore, by taking 
ψ(ξ) and φ(ζ) equal to δ-functions it is possible to derive the Green’s 
function, in terms of which one can express the solution for an arbitrary 
axially symmetric heat input. 

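Substituting (5) and (7) into (3) and making use of Bessel’s equation, 
we obtain 

$$\int_0^\infty \left[ f_{\zeta\zeta}(w,\zeta) - w^2 f(w,\zeta) \right] J_0(w\xi)\, dw = -\psi(\xi)\,\varphi(\zeta), \tag{8}$$

for 0 < ζ < 1. From the Hankel inversion formula, it follows that 

$$f_{\zeta\zeta}(w,\zeta) - w^2 f(w,\zeta) = -w\,\Psi(w)\,\varphi(\zeta), \tag{9}$$

where Ψ(w) is the Hankel transform of ψ(ξ), defined by 

$$\Psi(w) = \int_0^\infty \xi\, \psi(\xi)\, J_0(w\xi)\, d\xi. \tag{10}$$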


Extensive tables of Hankel transforms are available; we note in 
particular the following pairs. 

(i) Uniform beam of normalized radius a: 

$$\psi(\xi) = \begin{cases} 1, & 0 \le \xi < a, \\ 0, & \xi > a, \end{cases} \qquad \Psi(w) = (a/w)\, J_1(aw). \tag{11a}$$

(ii) Gaussian beam: 

$$\psi(\xi) = \exp(-\xi^2/a^2), \qquad \Psi(w) = (a^2/2) \exp(-a^2 w^2/4). \tag{11b}$$

(iii) Infinitesimally thin, hollow beam of radius ξ₀: 

$$\psi(\xi) = \delta(\xi - \xi_0), \qquad \Psi(w) = \xi_0\, J_0(w\xi_0). \tag{11c}$$
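
As a rough numerical check of the pairs (11a) and (11b), the transform (10) can be evaluated by direct quadrature. The sketch below is illustrative only: the helper names `bessel_j`, `simpson`, and `hankel0`, the sample values a = 1.5 and w = 2, and the truncation of the Gaussian integral at 8a are assumptions made here and are not part of the original analysis.

```python
import math

def bessel_j(order, x, steps=200):
    """J_n(x) from Bessel's integral (1/pi) * int_0^pi cos(n*t - x*sin t) dt."""
    h = math.pi / steps
    # Trapezoid rule; the integrand is smooth and periodic, so it converges fast.
    total = 0.5 * (1.0 + math.cos(order * math.pi))
    for k in range(1, steps):
        t = k * h
        total += math.cos(order * t - x * math.sin(t))
    return total * h / math.pi

def simpson(func, lo, hi, panels=400):
    """Composite Simpson rule with an even number of panels."""
    h = (hi - lo) / panels
    total = func(lo) + func(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def hankel0(psi, w, upper):
    """Eq. (10), truncated at xi = upper."""
    return simpson(lambda xi: xi * psi(xi) * bessel_j(0, w * xi), 0.0, upper)

a, w = 1.5, 2.0                                   # assumed sample values
uniform_numeric = hankel0(lambda xi: 1.0, w, a)   # psi = 1 inside the beam only
uniform_closed = (a / w) * bessel_j(1, a * w)     # eq. (11a)
gauss_numeric = hankel0(lambda xi: math.exp(-(xi / a) ** 2), w, 8.0 * a)
gauss_closed = 0.5 * a * a * math.exp(-(a * w) ** 2 / 4.0)   # eq. (11b)

print(uniform_numeric, uniform_closed)
print(gauss_numeric, gauss_closed)
```

The two numbers printed on each line should agree to several decimal places; a mismatch would point to a misread coefficient in the transform pair rather than to the quadrature.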

To satisfy the boundary conditions (4), we must have 

$$f_\zeta(w,1) = 0, \qquad f(w,0) = g(w), \qquad \varepsilon f_\zeta(w,0) = w\,g(w), \tag{12}$$

from which, eliminating g(w), 

$$f_\zeta(w,1) = 0, \qquad \varepsilon f_\zeta(w,0) = w f(w,0). \tag{13}$$


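The solution of the two-point boundary value problem for f(w,ζ) by 
standard methods leads to 

$$f(w,\zeta) = \Psi(w)\left[ \frac{\sinh w\zeta + \varepsilon \cosh w\zeta}{\cosh w + \varepsilon \sinh w} \int_0^1 \varphi(\eta) \cosh w(1-\eta)\, d\eta - \int_0^\zeta \varphi(\eta) \sinh w(\zeta-\eta)\, d\eta \right]. \tag{14}$$

From the second of (12), 

$$g(w) = \frac{\varepsilon \Psi(w)}{\cosh w + \varepsilon \sinh w} \int_0^1 \varphi(\eta) \cosh w(1-\eta)\, d\eta. \tag{15}$$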


In principle, the normalized temperature rise is completely given by 
(5), (6), (10), (14), and (15), provided that the heat input to the film 
can be represented as a product ψ(ξ)φ(ζ), and that there is no heat 
input to the substrate. A complete numerical solution for arbitrary 
ψ and φ would, however, involve the evaluation of five integrals, each 
of which depends on one or more parameters. In practice, one would 
try to approximate the source function in such a way that at least some 
of the integrations could be done analytically. An example is discussed 
in the following sections. 
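
The two-point boundary value problem of this subsection can be checked numerically. The following minimal sketch evaluates the bracketed factor f(w,ζ)/Ψ(w) of (14) by Simpson's rule and tests the two conditions (13) with central differences. The depth-dose function sin(5πζ/6) is borrowed from Section V; the values w = 1.3 and ε = 1/40, and all function names, are assumptions chosen here for illustration.

```python
import math

BETA = 5.0 * math.pi / 6.0   # depth-dose parameter taken from Section V

def simpson(func, lo, hi, panels=200):
    """Composite Simpson rule; also works when hi < lo (signed integral)."""
    h = (hi - lo) / panels
    total = func(lo) + func(hi)
    for k in range(1, panels):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3.0

def f_over_psi(w, zeta, eps, phi):
    """Bracketed factor of eq. (14), i.e. f(w, zeta) / Psi(w)."""
    source = simpson(lambda eta: phi(eta) * math.cosh(w * (1.0 - eta)), 0.0, 1.0)
    shape = (math.sinh(w * zeta) + eps * math.cosh(w * zeta)) / (
        math.cosh(w) + eps * math.sinh(w))
    local = simpson(lambda eta: phi(eta) * math.sinh(w * (zeta - eta)), 0.0, zeta)
    return shape * source - local

def d_zeta(w, zeta, eps, phi, step=1e-4):
    """Central-difference estimate of the zeta-derivative of f / Psi."""
    return (f_over_psi(w, zeta + step, eps, phi)
            - f_over_psi(w, zeta - step, eps, phi)) / (2.0 * step)

phi = lambda eta: math.sin(BETA * eta)
w, eps = 1.3, 1.0 / 40.0     # assumed sample values

top_flux = d_zeta(w, 1.0, eps, phi)                       # should vanish, first of (13)
interface_mismatch = (eps * d_zeta(w, 0.0, eps, phi)
                      - w * f_over_psi(w, 0.0, eps, phi)) # should vanish, second of (13)
print(top_flux, interface_mismatch)
```

Both printed residuals should be zero to within the finite-difference error, which is a quick way to catch a sign or limit error when transcribing (14).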


2.2 Heat Input to Substrate 


Again we take the heat input in the product form (7), but now assume 
that φ(ζ) differs from zero only in the substrate, ζ < 0. For the normalized 
temperature rise we assume 

$$U_1(\xi,\zeta) = \int_0^\infty l(w) \cosh w(1-\zeta)\, J_0(w\xi)\, dw, \qquad 0 \le \zeta \le 1, \tag{16}$$

$$U_2(\xi,\zeta) = \int_0^\infty m(w,\zeta)\, J_0(w\xi)\, dw, \qquad \zeta \le 0, \tag{17}$$

in the film and substrate, respectively, where l(w) and m(w,ζ) are functions 
to be determined. The expression for U₁(ξ,ζ) already satisfies 
Laplace’s equation and the first of the boundary conditions (4). 

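As before, it is easy to show that m(w,ζ) must satisfy the differential 
equation 

$$m_{\zeta\zeta}(w,\zeta) - w^2 m(w,\zeta) = -\varepsilon w\, \Psi(w)\, \varphi(\zeta), \tag{18}$$

and the boundary conditions 

$$\begin{aligned} l(w) \cosh w &= m(w,0), \\ -\varepsilon w\, l(w) \sinh w &= m_\zeta(w,0), \\ m(w,\zeta) &\to 0 \quad \text{as } \zeta \to -\infty. \end{aligned} \tag{19}$$

By standard methods we find 

$$m(w,\zeta) = D(w)\, e^{w\zeta} + \tfrac{1}{2} \varepsilon \Psi(w) \int_{-\infty}^0 \exp(-w|\zeta - \eta|)\, \varphi(\eta)\, d\eta, \tag{20}$$

for ζ ≤ 0. 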


It is clear that m(w,ζ) satisfies the last of the boundary conditions (19) 
if w > 0 and φ(ζ) vanishes for all sufficiently large negative ζ. In practical 
cases, it will certainly be justifiable to set the heat input identically 
equal to zero below some finite depth. We shall not take space to investigate 
the mathematical question of how slowly φ(ζ) could approach 
zero, and still have m(w,ζ) also approach zero, as ζ → −∞. 

From the first two boundary conditions (19), it is straightforward 
to calculate 


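$$l(w) = \frac{\varepsilon \Psi(w)}{\cosh w + \varepsilon \sinh w} \int_{-\infty}^0 \exp(w\eta)\, \varphi(\eta)\, d\eta, \tag{21}$$

$$D(w) = \tfrac{1}{2} \varepsilon \Psi(w) \left[ \frac{\cosh w - \varepsilon \sinh w}{\cosh w + \varepsilon \sinh w} \right] \int_{-\infty}^0 \exp(w\eta)\, \varphi(\eta)\, d\eta. \tag{22}$$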


The first of these, substituted into (16), gives the normalized tempera- 
ture rise in the film; and the second, together with (17) and (20), gives 
the temperature rise in the substrate. 
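
The coefficients (21) and (22) can be verified against the interface conditions with a few lines of arithmetic. In the sketch below the flux condition is taken as −ε w l(w) sinh w = m_ζ(w,0), the form obtained by differentiating (16) with respect to ζ. The exponential source profile, the sample values of w, ε, and Ψ(w), and the function name are hypothetical choices made only so that the source integral is elementary.

```python
import math

def substrate_coefficients(w, eps, psi_w, source_integral):
    """Return l(w) and D(w) of eqs. (21) and (22).

    source_integral stands for the integral of exp(w*eta)*phi(eta) over eta <= 0.
    """
    denom = math.cosh(w) + eps * math.sinh(w)
    l_w = eps * psi_w * source_integral / denom
    d_w = (0.5 * eps * psi_w * source_integral
           * (math.cosh(w) - eps * math.sinh(w)) / denom)
    return l_w, d_w

# Hypothetical source profile phi(eta) = exp(eta / depth) for eta <= 0, chosen
# only because its integral against exp(w*eta) is elementary: 1 / (w + 1/depth).
w, eps, psi_w, depth = 0.8, 1.0 / 40.0, 1.0, 0.5
source_integral = 1.0 / (w + 1.0 / depth)
l_w, d_w = substrate_coefficients(w, eps, psi_w, source_integral)

# From (20): m(w,0) = D + (eps*Psi/2)*I and m_zeta(w,0) = w*(D - (eps*Psi/2)*I).
m_at_interface = d_w + 0.5 * eps * psi_w * source_integral
m_slope_at_interface = w * (d_w - 0.5 * eps * psi_w * source_integral)

temperature_mismatch = l_w * math.cosh(w) - m_at_interface
flux_mismatch = -eps * w * l_w * math.sinh(w) - m_slope_at_interface
print(temperature_mismatch, flux_mismatch)
```

Both mismatches should vanish to rounding error for any positive w, which confirms that (21) and (22) follow from the first two conditions of (19).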


III. APPROXIMATIONS FOR A UNIFORM BEAM OF LARGE RADIUS 


We shall consider henceforth only the case in which the heat input is 
radially uniform out to the normalized radius ξ = a, and zero for ξ > a. 
Also we shall be interested only in the temperature distribution in the 
film itself. The dependence of heat input on depth will, however, still 
be taken as arbitrary. 

In the case where heat is applied to the film by means of a radially 
uniform beam which does not penetrate the substrate, the normalized 
temperature rise in the film is given by (5), (11a), and (14). In practice, 
the SiO₂ film may be only half a micron thick while the beam radius is 
several microns. We accordingly seek an asymptotic expansion of the 
temperature distribution for large a. In the analysis we assume that the 
conductivity ratio ε is fixed with ε < 1. Our results will also include the 
physically interesting case ε ≪ 1. 

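Combining (5), (11a), and (14), we may write the temperature distribution 
in the film in the form 

$$U_1(\xi,\zeta;\varepsilon) = a \int_0^\infty \left[ \varepsilon \int_0^1 \varphi(\eta)\, d\eta + h(w,\zeta;\varepsilon) \right] J_1(aw)\, J_0(\xi w)\, \frac{dw}{w}, \qquad 0 \le \zeta \le 1, \tag{23}$$

where 

$$h(w,\zeta;\varepsilon) = \frac{\sinh w\zeta + \varepsilon \cosh w\zeta}{\cosh w + \varepsilon \sinh w} \int_0^1 \varphi(\eta) \cosh w(1-\eta)\, d\eta - \varepsilon \int_0^1 \varphi(\eta)\, d\eta - \int_0^\zeta \varphi(\eta) \sinh w(\zeta-\eta)\, d\eta. \tag{24}$$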

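It is clear that the function h(w,ζ;ε) may be expanded in a power 
series around w = 0; that is, 

$$h(w,\zeta;\varepsilon) = \sum_{m=1}^\infty h^{(m)}(0,\zeta;\varepsilon)\, w^m/m!, \tag{25}$$

where the superscripts denote derivatives with respect to w. In particular, 

$$h^{(1)}(0,\zeta;\varepsilon) = (\zeta - \varepsilon^2) \int_0^1 \varphi(\eta)\, d\eta - \int_0^\zeta (\zeta - \eta)\, \varphi(\eta)\, d\eta. \tag{26}$$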


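Let 

$$\rho = \xi/a, \qquad \xi = a\rho, \tag{27}$$

so that the boundary of the heat input region is ρ = 1. We have 

$$P(\rho) \equiv \int_0^\infty J_1(aw)\, J_0(\rho a w)\, \frac{dw}{w} = \int_0^\infty J_1(x)\, J_0(\rho x)\, \frac{dx}{x} = \begin{cases} \dfrac{2}{\pi} E(\rho), & 0 \le \rho \le 1, \\[1ex] \dfrac{2}{\pi} \left[ \rho E\!\left(\dfrac{1}{\rho}\right) - \left(\rho - \dfrac{1}{\rho}\right) K\!\left(\dfrac{1}{\rho}\right) \right], & \rho \ge 1, \end{cases} \tag{28}$$

where E and K are complete elliptic integrals. 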
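
The function P(ρ) is simple to tabulate without a special-function library. The sketch below computes the complete elliptic integrals by the arithmetic-geometric mean and then evaluates P(ρ) as (2/π)E(ρ) for ρ ≤ 1 and (2/π)[ρE(1/ρ) − (ρ − 1/ρ)K(1/ρ)] for ρ ≥ 1. It assumes that the argument of E and K is the modulus k (not the parameter k²), an assumption that reproduces the expected limits P(0) = 1 and P(ρ) ≈ 1/(2ρ) for large ρ. The function names are illustrative.

```python
import math

def elliptic_ke(k):
    """Complete elliptic integrals K(k) and E(k) for modulus 0 <= k < 1 (AGM method)."""
    a, b, c = 1.0, math.sqrt(1.0 - k * k), k
    weight = 0.5
    series = weight * c * c          # running sum of 2**(n-1) * c_n**2
    while abs(c) > 1e-15:
        a, b, c = 0.5 * (a + b), math.sqrt(a * b), 0.5 * (a - b)
        weight *= 2.0
        series += weight * c * c
    big_k = math.pi / (2.0 * a)
    return big_k, big_k * (1.0 - series)

def p_of_rho(rho):
    """P(rho) of eq. (28), with the modulus convention assumed for E and K."""
    if rho == 1.0:
        return 2.0 / math.pi         # E(1) = 1; the K term is multiplied by zero
    if rho < 1.0:
        return 2.0 / math.pi * elliptic_ke(rho)[1]
    big_k, big_e = elliptic_ke(1.0 / rho)
    return 2.0 / math.pi * (rho * big_e - (rho - 1.0 / rho) * big_k)

for rho in (0.0, 0.5, 1.0, 2.0, 10.0):
    print(rho, p_of_rho(rho))
```

The value at the beam edge is 2/π, and the two branches join continuously there, in line with the description of P(ρ) given in the text.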

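Furthermore, assuming that ρ, ζ, and ε are fixed, the following asymptotic 
expansions for large a are derived in the Appendix, in terms of the 
derivatives of h(w,ζ;ε) at w = 0: 

$$\int_0^\infty h(w,\zeta;\varepsilon)\, J_1(aw)\, J_0(\rho a w)\, \frac{dw}{w} \sim \begin{cases} \dfrac{h^{(1)}(0,\zeta;\varepsilon)}{a} + \displaystyle\sum_{n=0}^\infty \dfrac{(-1)^n\, \Gamma(n+\frac{1}{2})}{\Gamma(\frac{1}{2})\, \Gamma(n+1)}\, F(n+\tfrac{3}{2}, n+\tfrac{1}{2}; 1; \rho^2)\, \dfrac{h^{(2n+2)}(0,\zeta;\varepsilon)}{(2n+2)\, a^{2n+2}}, & 0 \le \rho < 1, \\[2ex] \displaystyle\sum_{n=0}^\infty \dfrac{(-1)^{n+1}\, \Gamma(n+\frac{3}{2})}{\Gamma(\frac{1}{2})\, \Gamma(n+1)\, \rho^{2n+3}}\, F\!\left(n+\tfrac{3}{2}, n+\tfrac{3}{2}; 2; \dfrac{1}{\rho^2}\right) \dfrac{h^{(2n+2)}(0,\zeta;\varepsilon)}{(2n+2)\, a^{2n+2}}, & \rho > 1, \end{cases} \tag{29}$$

where F(a,b; c; z) is the hypergeometric function. Since when ρ is near 
unity, we have 

$$\begin{aligned} F(n+\tfrac{3}{2}, n+\tfrac{1}{2}; 1; \rho^2) &= O[(1-\rho)^{-2n-1}], & 0 < (1-\rho) \ll 1, \\ F\!\left(n+\tfrac{3}{2}, n+\tfrac{3}{2}; 2; \frac{1}{\rho^2}\right) &= O[(\rho-1)^{-2n-1}], & 0 < (\rho-1) \ll 1, \end{aligned} \tag{30}$$

it follows that the asymptotic expansions (29) are useful only for 
a|1 − ρ| ≫ 1, but not in the neighborhood of ρ = 1. 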
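Combining (23), (26), (28), and (29), we obtain finally, 

$$\begin{aligned} U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho) \int_0^1 \varphi(\eta)\, d\eta + (\zeta - \varepsilon^2) \int_0^1 \varphi(\eta)\, d\eta - \int_0^\zeta (\zeta - \eta)\, \varphi(\eta)\, d\eta + O[\varepsilon/a(1-\rho)], & 0 \le \rho < 1; \\ U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho) \int_0^1 \varphi(\eta)\, d\eta + O[\varepsilon/a(\rho-1)\rho^2], & \rho > 1, \end{aligned} \tag{31}$$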


where P(ρ) is defined by (28). That the remainder terms are O(ε) 
when ε ≪ 1 may be seen from the relations 

$$\begin{aligned} h(w,\zeta;\varepsilon) &= -h(-w,\zeta;-\varepsilon), \\ h^{(2n)}(0,\zeta;\varepsilon) &= -h^{(2n)}(0,\zeta;-\varepsilon); \end{aligned} \tag{32}$$

i.e., the even derivatives of h with respect to w at w = 0 are odd functions 
of ε. Note, however, that setting ε = 0 in the asymptotic solution (31) 
does not give the exact solution for a perfectly conducting substrate, 
inasmuch as there are exponentially small terms in a which never appear 
in the asymptotic solution. The exact solution for ε = 0 is given in 
Section IV. 

When the product εa is sufficiently large, the leading terms in the 
asymptotic solution (31) are proportional to P(ρ); that is, they are 
functions of ρ (= ξ/a) only, and are independent of the depth ζ in the 
film. The function P(ρ) is plotted in Fig. 2. It is continuous, with a 
logarithmically infinite slope, at ρ = 1. Numerical comparisons between 
the exact solution (23) and the approximate solution (31) are made in 
Section V. 

We now look briefly at the case of heat input to the substrate by a 
radially uniform beam of normalized radius a and depth dependence 
φ(ζ), for ζ ≤ 0. The normalized temperature rise in the film is, from 
(11a), (16), (21), and (27), 

$$U_1(\rho a,\zeta;\varepsilon) = \varepsilon a \int_0^\infty \frac{\cosh w(1-\zeta)}{w(\cosh w + \varepsilon \sinh w)} \left[ \int_{-\infty}^0 \exp(w\eta)\, \varphi(\eta)\, d\eta \right] J_1(aw)\, J_0(\rho a w)\, dw, \qquad 0 \le \zeta \le 1. \tag{33}$$

When ρ, ζ, and ε are fixed, and both a ≫ 1 and a|1 − ρ| ≫ 1, an analysis 
entirely similar to the preceding gives 
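
$$\begin{aligned} U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho) \int_{-\infty}^0 \varphi(\eta)\, d\eta + \varepsilon \int_{-\infty}^0 (\eta - \varepsilon)\, \varphi(\eta)\, d\eta + O[\varepsilon/a(1-\rho)], & 0 \le \rho < 1; \\ U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho) \int_{-\infty}^0 \varphi(\eta)\, d\eta + O[\varepsilon/a(\rho-1)\rho^2], & \rho > 1. \end{aligned} \tag{34}$$

Fig. 2 — The function P(ρ) = ∫₀^∞ J₁(x) J₀(ρx) dx/x. 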


IV. APPROXIMATIONS FOR A UNIFORM BEAM WITH LARGE SUBSTRATE 
CONDUCTIVITY 


We now assume that ε ≪ 1; that is, the conductivity of the substrate 
is large compared to the conductivity of the film. (For an SiO₂ film on 
silicon, ε is between 0.1 and 0.01.) Again we consider a radially uniform 
beam, with an arbitrary depth-dose function φ(ζ). No restrictions are 
placed on the normalized beam radius a. 

For heat input to the film only, the normalized temperature rise in 
the film is given by (5), (11a), and (14). Referring to (14), we expand 
the function f(w,ζ) in powers of ε. After a little algebra, we find that the 
normalized temperature rise in the film can be written as 

$$U_1(\xi,\zeta;\varepsilon) = \sum_{n=0}^\infty \varepsilon^n\, U_1^{(n)}(\xi,\zeta), \tag{35}$$

where 

$$U_1^{(0)}(\xi,\zeta) = a \int_0^\infty F(w,\zeta)\, J_1(aw)\, J_0(\xi w)\, dw, \tag{36}$$

with 

$$F(w,\zeta) = \frac{1}{w \cosh w} \left[ \sinh w\zeta \int_\zeta^1 \varphi(\eta) \cosh w(1-\eta)\, d\eta + \cosh w(1-\zeta) \int_0^\zeta \varphi(\eta) \sinh w\eta\, d\eta \right], \tag{37}$$

and for n ≥ 1, 

$$U_1^{(n)}(\xi,\zeta) = (-1)^{n-1}\, a \int_0^\infty \left[ \int_0^1 \varphi(\eta) \cosh w(1-\eta)\, d\eta \right] \frac{\cosh w(1-\zeta)\, \tanh^{n-1} w}{w \cosh^2 w}\, J_1(aw)\, J_0(\xi w)\, dw. \tag{38}$$

The quantity U₁⁽⁰⁾(ξ,ζ) is the normalized temperature rise in the film 
when the substrate is perfectly conducting. In this case, the temperature 
rise at the bottom of the film is zero, and in fact it is obvious from (37) 
that F(w,0) = 0. The quantity U₁⁽ⁿ⁾(ξ,ζ) represents the nth order 
correction if ε is small but not zero. 

We may evaluate U₁⁽⁰⁾(ξ,ζ) by contour integration. Let S_n denote the 
semicircle of radius nπ in the upper half-plane (n = 1, 2, ···), with 
diameter along the real axis indented at the origin. From (37) it follows 
that F(w,ζ) is uniformly bounded on S_n, and has simple poles within 
S_n at w = (m − ½)πi (m = 1, ···, n). The choice of integrand depends 
on whether 0 ≤ ξ ≤ a, or ξ ≥ a. For 0 ≤ ξ ≤ a we consider 

$$\int_{S_n} F(w,\zeta)\, J_0(\xi w)\, H_1^{(1)}(aw)\, dw, \tag{39}$$

and for ξ ≥ a we consider 

$$\int_{S_n} F(w,\zeta)\, J_1(aw)\, H_0^{(1)}(\xi w)\, dw, \tag{40}$$

where H₀⁽¹⁾ and H₁⁽¹⁾ are Hankel functions. The integrals are evaluated 
by the calculus of residues and then the limit n → ∞ is taken. Since the 
procedure is a standard one, we omit the details and merely state the 
results. 

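We obtain, for 0 ≤ ξ ≤ a, 

$$U_1^{(0)}(\xi,\zeta) = \zeta \int_\zeta^1 \varphi(\eta)\, d\eta + \int_0^\zeta \eta\, \varphi(\eta)\, d\eta - \frac{2a}{\pi} \sum_{m=1}^\infty \frac{\sin[(m-\frac{1}{2})\pi\zeta]}{m-\frac{1}{2}}\, I_0[(m-\tfrac{1}{2})\pi\xi]\, K_1[(m-\tfrac{1}{2})\pi a] \int_0^1 \varphi(\eta) \sin[(m-\tfrac{1}{2})\pi\eta]\, d\eta, \tag{41}$$

and, for ξ ≥ a, 

$$U_1^{(0)}(\xi,\zeta) = \frac{2a}{\pi} \sum_{m=1}^\infty \frac{\sin[(m-\frac{1}{2})\pi\zeta]}{m-\frac{1}{2}}\, I_1[(m-\tfrac{1}{2})\pi a]\, K_0[(m-\tfrac{1}{2})\pi\xi] \int_0^1 \varphi(\eta) \sin[(m-\tfrac{1}{2})\pi\eta]\, d\eta. \tag{42}$$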

Here I₀, I₁, K₀, and K₁ are modified Bessel functions. The continuity 
of ∂U₁⁽⁰⁾/∂ξ at ξ = a is readily verified, while that of U₁⁽⁰⁾ at ξ = a 
follows from the identities 

$$I_0(x)\, K_1(x) + I_1(x)\, K_0(x) = 1/x, \tag{43}$$

and 

$$\sum_{m=1}^\infty \frac{\cos[(m-\frac{1}{2})\pi\theta]}{(m-\frac{1}{2})^2} = \frac{\pi^2}{2}(1 - \theta), \qquad 0 \le \theta \le 2. \tag{44}$$

We remark that (41) and (42) could have been derived by separation 
of variables in (3). For |a − ξ| ≫ 1 the series on the right-hand sides of 
(41) and (42) are exponentially small. It follows that (41) and (42) are 
consistent with the asymptotic expansions (31) and (32) for a ≫ 1, when 
ε = 0 and asymptotically small terms are neglected. 
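
The Fourier series identity used for the continuity argument, namely that the sum over m of cos[(m − ½)πθ]/(m − ½)² equals (π²/2)(1 − θ) for 0 ≤ θ ≤ 2, can be spot-checked by partial sums. The following sketch is only a numerical illustration; the function names and the number of terms are assumptions, and the truncation error is of the order of the reciprocal of the number of terms.

```python
import math

def cosine_series(theta, terms=20000):
    """Partial sum of cos[(m - 1/2)*pi*theta] / (m - 1/2)**2 for m = 1..terms."""
    return sum(math.cos((m - 0.5) * math.pi * theta) / (m - 0.5) ** 2
               for m in range(1, terms + 1))

def closed_form(theta):
    """Right-hand side of the identity, valid for 0 <= theta <= 2."""
    return 0.5 * math.pi ** 2 * (1.0 - theta)

for theta in (0.0, 0.3, 1.0, 1.7, 2.0):
    print(theta, cosine_series(theta), closed_form(theta))
```

Together with the Wronskian relation for the modified Bessel functions, this identity turns the series in the interior solution at ξ = a into the one-dimensional term plus the exterior series, which is the continuity statement made above.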

It does not appear possible to evaluate the first-order correction 
U₁⁽¹⁾(ξ,ζ), as given by (38), using contour integration, because the 
integrand has the wrong parity in w. We can, however, obtain a bound 
on the value of U₁⁽¹⁾(0,0), at the “hot spot” of the film-substrate interface, 
where the zero-order solution U₁⁽⁰⁾(0,0) vanishes. 

From (38), setting n = 1 and changing the order of integration, 
which is justified since the double integral is absolutely convergent, 

$$U_1^{(1)}(0,0) = a \int_0^1 \varphi(\eta) \left[ \int_0^\infty \frac{\cosh w(1-\eta)}{\cosh w}\, \frac{J_1(aw)}{w}\, dw \right] d\eta. \tag{45}$$
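
When η = 0, the inner integral is equal to 1. When η > 0, we transform 
the inner integral by substituting the integral representation 

$$J_1(aw) = (2aw/\pi) \int_0^{\pi/2} \cos(aw \cos\theta)\, \sin^2\theta\, d\theta, \tag{46}$$

and again invoking the absolute convergence of the double integral to 
change the order of integration. This leads to 

$$\begin{aligned} \int_0^\infty \frac{\cosh w(1-\eta)}{\cosh w}\, \frac{J_1(aw)}{w}\, dw &= \frac{2a}{\pi} \int_0^{\pi/2} \left[ \int_0^\infty \frac{\cosh w(1-\eta)}{\cosh w} \cos(aw \cos\theta)\, dw \right] \sin^2\theta\, d\theta \\ &= a \sin\frac{\pi\eta}{2} \int_0^{\pi/2} \frac{\cosh(\frac{1}{2}\pi a \cos\theta)\, \sin^2\theta\, d\theta}{\sinh^2(\frac{1}{2}\pi a \cos\theta) + \sin^2(\frac{1}{2}\pi\eta)} \\ &\le a \sin\frac{\pi\eta}{2} \int_0^{\pi/2} \frac{\cosh(\frac{1}{2}\pi a \cos\theta)\, \sin\theta\, d\theta}{\sinh^2(\frac{1}{2}\pi a \cos\theta) + \sin^2(\frac{1}{2}\pi\eta)} \\ &= \frac{2}{\pi} \tan^{-1}\!\left[ \frac{\sinh(\frac{1}{2}\pi a)}{\sin(\frac{1}{2}\pi\eta)} \right] < 1, \end{aligned} \tag{47}$$

where the third line follows from a table of Fourier transforms. Hence, 
finally, from (45) and (47), 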


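$$U_1^{(1)}(0,0) = a^2 \int_0^1 \varphi(\eta) \sin\frac{\pi\eta}{2} \left[ \int_0^{\pi/2} \frac{\cosh(\frac{1}{2}\pi a \cos\theta)\, \sin^2\theta\, d\theta}{\sinh^2(\frac{1}{2}\pi a \cos\theta) + \sin^2(\frac{1}{2}\pi\eta)} \right] d\eta < a \int_0^1 \varphi(\eta)\, d\eta. \tag{48}$$

We see from (31) that asymptotically, for a ≫ 1, the upper bound in 
(48) is attained, since P(0) = 1. 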

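Now suppose that heat is put only into the substrate, so that the 
normalized temperature rise in the film is given by (33). Expanding in 
powers of ε leads to 

$$U_1(\xi,\zeta;\varepsilon) = \sum_{n=1}^\infty \varepsilon^n\, U_1^{(n)}(\xi,\zeta), \tag{49}$$

where 

$$U_1^{(n)}(\xi,\zeta) = a \int_0^\infty \frac{(-1)^{n-1} \cosh(1-\zeta)w\, \tanh^{n-1} w}{w \cosh w} \left[ \int_{-\infty}^0 \exp(w\eta)\, \varphi(\eta)\, d\eta \right] J_1(aw)\, J_0(\xi w)\, dw. \tag{50}$$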

In particular we have, on changing the order of integration, 

$$U_1^{(1)}(0,0) = a \int_{-\infty}^0 \varphi(\eta) \left[ \int_0^\infty \exp(w\eta)\, J_1(aw)\, \frac{dw}{w} \right] d\eta = \int_{-\infty}^0 \varphi(\eta) \left[ (a^2 + \eta^2)^{1/2} + \eta \right] d\eta, \tag{51}$$

after substituting the known value of the inner integral. 


V. NUMERICAL RESULTS 


In this section we give the results of some numerical computations 
using the exact formulas of Section II, and some comparisons with the 
approximations of Sections III and IV. We assume that the electron 
beam voltage is such that the electrons penetrate to the bottom of the 
oxide film (about 5 kv for a 0.5 μ film), but do not enter the substrate. 
For the depth-dose function we take 


$$\varphi(\zeta) = \sin \beta\zeta, \qquad \beta = 5\pi/6, \qquad 0 \le \zeta \le 1, \tag{52}$$

which is plotted in Fig. 3. The assumed depth-dose function vanishes at 
the bottom of the film, has a maximum at a depth equal to 40 per cent 
of the film thickness, and in general corresponds very closely to the 
empirical function used by Grün and Wells. The position of the maximum 
could be varied, of course, by changing the parameter β. 

Fig. 3 — Heat input function φ(ζ) = sin (5πζ/6). [Axes: heat input φ(ζ), from 0 to 1.0; normalized height ζ.] 
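
The stated properties of the depth-dose function φ(ζ) = sin βζ with β = 5π/6 follow from elementary calculus, and the short sketch below merely confirms them numerically. The variable names are illustrative, and ζ is taken as height above the film-substrate interface, so that depth below the surface is 1 − ζ.

```python
import math

BETA = 5.0 * math.pi / 6.0            # parameter of the depth-dose function

def depth_dose(zeta, beta=BETA):
    """phi(zeta) = sin(beta*zeta); zeta is height above the interface."""
    return math.sin(beta * zeta)

peak_height = math.pi / (2.0 * BETA)  # where beta*zeta = pi/2
peak_depth = 1.0 - peak_height        # measured down from the film surface
total_dose = (1.0 - math.cos(BETA)) / BETA   # integral of phi over 0..1

# Brute-force confirmation of the peak location on a fine grid.
grid = [i / 1000.0 for i in range(1001)]
grid_peak = max(grid, key=depth_dose)

print(peak_height, peak_depth, total_dose, grid_peak)
```

The integral of φ over the film, (1 − cos β)/β, is the factor that later converts the incident power density into the volume source strength.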

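Substituting φ(ζ) into (5), (11a), and (14), we find after some algebra 
that the normalized temperature rise in the film is given by 

$$U_1(\xi,\zeta) = a V(\xi) \sin \beta\zeta + a\beta\, W(\xi,\zeta), \tag{53}$$

where 

$$V(\xi) = \int_0^\infty \frac{J_1(aw)\, J_0(\xi w)}{w^2 + \beta^2}\, dw \tag{54}$$

and 

$$W(\xi,\zeta) = \int_0^\infty \frac{J_1(aw)\, J_0(\xi w)}{w(w^2 + \beta^2)} \left\{ \frac{\varepsilon \cosh(1-\zeta)w - (\sinh \zeta w + \varepsilon \cosh \zeta w) \cos\beta}{\cosh w + \varepsilon \sinh w} \right\} dw. \tag{55}$$

The integral on the right side of (54) can be expressed in terms of 
modified Bessel functions. We have 

$$\int_0^\infty \frac{w\, J_1(aw)\, J_1(tw)}{w^2 + \beta^2}\, dw = \begin{cases} I_1(\beta t)\, K_1(\beta a), & t \le a, \\ I_1(\beta a)\, K_1(\beta t), & t \ge a. \end{cases} \tag{56}$$

Integrating both sides with respect to t from a to ξ and using the relationship 

$$\int_0^\infty \frac{J_1(aw)\, J_0(aw)}{w^2 + \beta^2}\, dw = \frac{I_1(a\beta)\, K_0(a\beta)}{\beta}, \tag{57}$$

we obtain 

$$V(\xi) = \begin{cases} \dfrac{1}{\beta} \left[ \dfrac{1}{\beta a} - K_1(\beta a)\, I_0(\beta\xi) \right], & \xi \le a, \\[1ex] \dfrac{I_1(\beta a)\, K_0(\beta\xi)}{\beta}, & \xi \ge a. \end{cases} \tag{58}$$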

The integral for W(ξ,ζ), on the other hand, has to be evaluated 
numerically. The integrand is oscillatory, and falls off exponentially for 
large w if 0 < ζ < 1. If ζ = 0 or ζ = 1, it falls off like 1/w⁴ if ξ ≠ 0, 
and like 1/w^(7/2) if ξ = 0. The numerical integration was done by Simpson’s 
rule on an IBM 7094 computer. Combined analytic and empirical 
investigations of the accuracy were made in order to guarantee that the 
relative error in any value of U is less (in most cases, much less) than 
one per cent. 

Four different sets of parameters were chosen; namely, a = 2, 10, and 
20 with ε = 1/40, and a = 2 with ε = 1/4. The normalized temperature 
rises at the surface of the film and at the film-substrate interface are 
plotted against normalized radius in Fig. 4. Note the differences in 
scale; in each case the edge of the beam, ξ = a, is at the center of the 
plot. The temperature distribution along the vertical axis is shown for 
the same four cases in Fig. 5. 

It is seen that the temperature distribution at the surface of the film 
becomes more flat-topped, and the fall-off at the edge of the beam 
becomes relatively (although not absolutely) more abrupt as a increases 
in the first three cases of Fig. 4. Also note that the temperature levels are 
somewhat higher and the temperature variation through the film is less 
in Fig. 4(d) than in Fig. 4(a), since for the same value of a the relative 
conductivity of the substrate is only 1/10 as large in Fig. 4(d) as in Fig. 
4(a). 

The dashed curves in Figs. 4 and 5 correspond to the approximate 
formulas (31) for large a. If φ(ζ) is given by (52), these approximations 
read: 

$$\begin{aligned} U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho)\, \frac{1 - \cos\beta}{\beta} + \frac{1}{\beta^2} \left[ \sin \beta\zeta - \beta\zeta \cos\beta - \varepsilon^2 \beta (1 - \cos\beta) \right], & 0 \le \rho < 1, \\ U_1(\rho a,\zeta;\varepsilon) &\sim \varepsilon a P(\rho)\, \frac{1 - \cos\beta}{\beta}, & \rho > 1, \end{aligned} \tag{59}$$

where ρ = ξ/a. As expected, the approximations are discontinuous at 
the edge of the beam, ρ = 1; and they are not much good when a = 2 
(worse for the larger value of ε). They are remarkably good, however, 
for a = 10 and a = 20; the dashed curves essentially coincide with the 
solid ones except on the surface of the film in the immediate neighborhood 
of the beam edge. 
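
On the beam axis P(0) = 1, so the large-radius approximation reduces to elementary functions and can be compared directly with the exact on-axis values U₁(0,1) and U₁(0,0) quoted in Table I. The sketch below does this for a = 10 and a = 20 with ε = 1/40. The function name is illustrative, and the formula coded is the inner (ρ < 1) branch of the approximation evaluated at ρ = 0, under the assumption that the depth-dose function is sin βζ with β = 5π/6.

```python
import math

BETA = 5.0 * math.pi / 6.0

def u1_axis_large_radius(zeta, a, eps, beta=BETA):
    """Large-a approximation to U1 on the axis (rho = 0, where P = 1)."""
    dose = (1.0 - math.cos(beta)) / beta
    depth_term = (math.sin(beta * zeta) - beta * zeta * math.cos(beta)
                  - eps ** 2 * beta * (1.0 - math.cos(beta))) / beta ** 2
    return eps * a * dose + depth_term

# Exact on-axis values (top, bottom) as quoted in Table I of the paper.
exact = {10: (0.5799, 0.1771), 20: (0.7589, 0.3556)}
eps = 1.0 / 40.0
for a, (top_exact, bottom_exact) in exact.items():
    top = u1_axis_large_radius(1.0, a, eps)
    bottom = u1_axis_large_radius(0.0, a, eps)
    print(a, top, top_exact, bottom, bottom_exact)
```

For these two radii the approximation differs from the tabulated exact values by a few parts in a thousand, which is in line with the agreement described in the text; for a = 2 the same formula is noticeably less accurate.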

Contour plots for the temperature distribution in the film are given 
in Fig. 6 for a = 2 and a = 10 with ε = 1/40, and for a = 2 with 
ε = 1/4. Contour plots were not made for a = 20, because the numerical 
integration is slow for large a (the integrands oscillate more rapidly); 
but it is clear that for large a the approximate formulas (59) would 
yield accurate contours, except very close to the beam edge. 

Fig. 4 — Normalized temperature rise at upper and lower surfaces of film: 
(a) a = 2, ε = 1/40; (b) a = 10, ε = 1/40; (c) a = 20, ε = 1/40; (d) a = 2, ε = 1/4. 

We may also compare the bound on the first-order correction term 
for small ε, as given in Section IV, with the exact results. At the center 
of the film-substrate interface, (48) and (52) give 

$$\varepsilon U_1^{(1)}(0,0) < \varepsilon a (1 - \cos\beta)/\beta = 0.713\, \varepsilon a, \tag{60}$$

which leads to the following comparison with the exact solution U₁(0,0). 

      ε       a     U₁(0,0)    0.713εa 
     1/40     2     0.0314     0.0356 
     1/40    10     0.1770     0.1782 
     1/40    20     0.3556     0.3564 
     1/4      2     0.2874     0.3564 

Fig. 5 — Normalized temperature rise along axis, plotted against normalized height ζ: (a) a = 2, ε = 1/40; (b) 
a = 10, ε = 1/40; (c) a = 20, ε = 1/40; (d) a = 2, ε = 1/4. 


In order to relate the dimensionless temperature rise U₁ to the physical 
temperature rise T₁, it is convenient to introduce the power density 
(i.e., per unit area) in the incident beam. For a uniform beam of radius 
ac with depth-dose function φ(z/c), the dimensional factor Q₀, which 
normalizes the heat input per unit volume (Section II), is related to 
the incident power density P₀ by 

$$P_0 = \int_0^c Q_0\, \varphi(z/c)\, dz = c\, Q_0 \int_0^1 \varphi(\zeta)\, d\zeta, \tag{61}$$

so that 

$$Q_0 = P_0 \Big/ c \int_0^1 \varphi(\zeta)\, d\zeta. \tag{62}$$

Hence, the actual temperature rise at the point (r,z) of the film is given 
by 

$$T_1(r,z) = (c^2 Q_0/K_1)\, U_1(r/c, z/c) = (1.403\, c P_0/K_1)\, U_1(r/c, z/c), \tag{63}$$

where the numerical coefficient corresponds to the depth-dose function 
(52). Consistent units for (63) are: 

    T₁ = temperature rise in °C 
    r, z = coordinates in cm 
    c = film thickness in cm 
    P₀ = incident power density in watts/cm² 
    K₁ = film conductivity in watt/cm °K 
    0.239 K₁ = film conductivity in cal/sec cm °K. 

Fig. 6 — U₁ contours: (a) a = 2, ε = 1/40; (b) a = 10, ε = 1/40; (c) a = 2, ε = 1/4. 


We shall now look at a numerical example. It turns out that the 
constant-conductivities model which has been analyzed in the present 
paper is not a very good approximation to the real problem of a silicon 
dioxide film on a silicon substrate, because the thermal conductivities 
of both materials depend strongly on temperature. In fact, the conductivity 
of SiO₂ increases from about 0.015 watt/cm °K at room temperature 
(various values are reported — not for thin films — which differ 
among themselves by as much as 2:1 depending on the crystalline orientation 
of the sample) to about 0.03 watt/cm °K at 900°C. For Si, on the 
other hand, the conductivity decreases from about 1 watt/cm °K at 
room temperature to about 0.03 watt/cm °K at 900°C. For purposes of 
calculation we shall more or less arbitrarily assume the values 

$$\begin{aligned} K_1 &= 0.03 \text{ watt/cm °K} \\ K_2 &= 1.2 \text{ watt/cm °K} \\ \varepsilon &= 1/40 \\ c &= 0.5\ \mu = 5 \times 10^{-5} \text{ cm} \\ P_0 &= 10^6 \text{ watts/cm}^2. \end{aligned} \tag{64}$$

Since these conductivities may be somewhat larger than the actual 
conductivities, the temperatures which we shall compute may be somewhat 
lower than the actual temperatures. For a 5-kv beam, the assumed 
power density corresponds to a current density of 200 amps/cm². 

Table I gives the total beam current and the maximum temperature 
rise (i.e., on the axis) at the top and bottom of the film, for beams of 
diameter 2 », 10 », and 20 un, corresponding to the previous computations 
with a = 2, 10, and 20, and a conductivity ratio of 1/40. It appears, 
therefore, that in each case at least a part of the irradiated spot would 
be raised to the temperature at which the oxide hardens (900°C), but 
in no case would the substrate melt (1415°C). 
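The beam currents quoted in Table I can be checked from the assumed power density and beam voltage alone. The Python sketch below reproduces that arithmetic; the function names are illustrative and do not come from the paper, and the beam is assumed to have uniform current density over a circular spot.

```python
import math

P0_WATTS_PER_CM2 = 1.0e6   # incident power density assumed in (64)
BEAM_VOLTS = 5.0e3         # 5-kv beam, as stated in the text

def current_density(p0=P0_WATTS_PER_CM2, volts=BEAM_VOLTS):
    """Current density in amps/cm^2 for power density p0 at the given voltage."""
    return p0 / volts

def beam_current_microamps(diameter_microns):
    """Total current (microamps) of a uniform circular beam of this diameter."""
    radius_cm = 0.5 * diameter_microns * 1.0e-4   # 1 micron = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    return current_density() * area_cm2 * 1.0e6   # amps -> microamps

if __name__ == "__main__":
    for d in (2, 10, 20):
        print(f"{d:>2} micron beam: {beam_current_microamps(d):.3g} microamps")
```

Because the current scales with the square of the diameter, the 20 μ beam carries one hundred times the current of the 2 μ beam at the same power density.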

It is probably worth repeating that the physical problem of interest 
is nonlinear, because of the dependence of conductivity on temperature. 
Bounds on the solution may be obtained from linear models, by using 
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TABLE I 
Diameter   Current    U_1(0,1)   U_1(0,0)   T_1(0,c)   T_1(0,0)
  2 μ      6.28 μa    0.3857     0.0314      902°        73°
 10 μ      157 μa     0.5799     0.1771     1356°       414°
 20 μ      628 μa     0.7589     0.3556     1775°       832°


the theorem that with a fixed heat input the steady-state temperature 
is not increased anywhere (usually, it is decreased everywhere) if the 
conductivity is increased anywhere, and vice versa. However, only a 
full-dress numerical treatment of the nonlinear partial differential 
equation, assuming that one knew the temperature dependence of the 
conductivity, would be likely to yield really accurate results. 


VI. TRANSIENT EFFECTS 


It is of interest to know how long it will take to reach the steady 
state if the electron beam is suddenly switched onto the film, since this 
gives an idea of how rapidly the beam may be scanned in laying out a 
mask. There have been some published analyses’:”! of transient heating 
effects in electron beam machining, but we shall content ourselves with 
a crude estimate of the time scale in the present problem. 

Consider the case of a film on a perfectly conducting substrate, with 
the film initially at zero temperature, and with a time-independent heat 
input starting at t = 0. Then the instantaneous temperature distribution
satisfies the heat flow equation

    \frac{\partial^2 T}{\partial r^2} + \frac{1}{r}\frac{\partial T}{\partial r} + \frac{\partial^2 T}{\partial z^2} + \frac{Q}{K_1} = \frac{C\delta}{K_1}\frac{\partial T}{\partial t},    (65)

where K_1 is the thermal conductivity, C the heat capacity, and δ the
density. The total temperature T(r,z,t) may be written as the sum of a
steady-state part and a transient part,

    T(r,z,t) = T_1(r,z) + \Theta(r,z,t),    (66)

where T_1(r,z) satisfies Poisson’s equation (cf. Section II) and Θ(r,z,t)
satisfies the homogeneous equation

    \frac{\partial^2 \Theta}{\partial r^2} + \frac{1}{r}\frac{\partial \Theta}{\partial r} + \frac{\partial^2 \Theta}{\partial z^2} = \frac{C\delta}{K_1}\frac{\partial \Theta}{\partial t}    (67)

and vanishes as t → ∞.

A sufficiently general solution of (67) may be written in the form

    \Theta(r,z,t) = \sum_{m=0}^{\infty} \int_0^{\infty} A_m(w) \exp\{-(K_1/C\delta)[w^2 + (m+\tfrac{1}{2})^2\pi^2/c^2]\,t\} J_0(wr) \sin\frac{(m+\tfrac{1}{2})\pi z}{c}\,dw.    (68)

The initial condition requires that the total temperature vanish at
t = 0; that is,

    T_1(r,z) + \Theta(r,z,0) = 0.    (69)

Hence, from a knowledge of the steady-state temperature one can in
principle use the properties of Fourier series and Fourier-Bessel integrals
to determine the functions A_m(w) and the transient solution Θ(r,z,t).

We would like to know how fast Θ(r,z,t) approaches zero with
increasing time. It is clear that (68) cannot be characterized by any single
exponential decay; but we observe that the most slowly decaying
exponential is

    \exp[-(K_1\pi^2/4c^2C\delta)\,t],

and it is therefore reasonable to define a crude “transient time” as

    \tau = 4c^2C\delta/\pi^2K_1.    (70)
We assume the following numerical values for the SiO2 film:

    c = 5 \times 10^{-5} cm
    C = 1 watt sec/gm °K                              (71)
    \delta = 2.2 gm/cm^3
    K_1 = 0.03 watt/cm °K.

Then 
    \tau = 3.0 \times 10^{-7} sec,    (72)


so the transient time is a fraction of a microsecond. 
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APPENDIX 


Asymptotic Expansion of a Class of Integrals 


In this appendix, the asymptotic expansion of

    H(\rho, a) = \int_0^{\infty} h(w) J_1(aw) J_0(\rho a w)\,dw    (73)

is derived, where

    h(w) = \sum_{m=0}^{\infty} h^{(m)}(0)\,w^m/m!,    (74)

for a >> 1 and a|1 - ρ| >> 1, that is, for ρ not in the neighborhood of
1. It is clear that the asymptotic expansions will break down in the
neighborhood of ρ = 1, since then the integrand will contain a term
which is not rapidly oscillating.

We start from a result given by Tranter.” Namely, if we have the 
expansion 


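    \int_0^{\infty} \exp(-rw) F(aw)\,dw = \sum_{n=0}^{\infty} A_n(a)\,r^n,    (75)

then, formally,

    \int_0^{\infty} h(w) F(aw)\,dw \sim \sum_{n=0}^{\infty} (-1)^n A_n(a)\,h^{(n)}(0).    (76)

In the case at hand,

    F(aw) = J_1(aw) J_0(\rho a w).    (77)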


Assume first that p is fixed and 0 < p < 1. Thenif (e°+ 7)? > pa + Y; 
we have” 


2 1 heed (+1 m amen 2 sa 9 
[exp (ro) Haw) Ja poe) = 2 Sesto eT 


ne m+1 a 
Pe (eee eae = 1.9. % 
(<3) (m+ 1, m+ iat). 


Using standard transformations,” 


(78) 
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[exp (—rw) uC) Sapa) 


1 (—1)"o"F(2m + 2) @ 
anm=o 2'r(m+ 1)" \vy 


2 
«rn 1m +4525 —S) 


* (—1)"o'"P(2m + 2)1(— 4) 
“2 03E ete DG 


l 


2 
Be (79) 
_1_ dy Sem t (mM + §) 
a Ta’? m [Tim + 1)}? 


2 
x F(m + 8,m-+ 45% —%) 


Lor S(-p" re +9 
V roe n=0 aT in + 1) 


XFnt+3n+blip), 


I 





where in the last step we have expanded the hypergeometric function 
in a power series and interchanged the order of summation. Comparing 
(79) with (75), we see that 
1/a if n=0, 
Aon (a) = ; (80) 
0 if n> 0, 


(—1)""Tin +4) 
aT) (n +1) 


It follows from (76) that 


Amsala) = Fa + $n +4; 1; p) . 


I * GaSe) Eat 


wv BO) (21) est) 
> PQ)r(n + i) 


pert (0 ) 


oe n+2 


(81) 
XFn+3,n+h le) 


for 0 < ρ < 1.
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An entirely similar derivation, the details of which will be omitted, 
leads to the expansion 


09 n+1 3 
I h(w)J1(aw)Jo(paw)dw ~ » Es 


porte 0 
x (n+ ijn +452 ee 





(82) 


for ρ > 1.
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Predictive Quantizing Systems 
(Differential Pulse Code Modulation) 
for the Transmission of 
Television Signals 


By J. B. O’NEAL, JR. 
(Manuscript received December 27, 1965)


Differential pulse code modulation (DPCM) and predictive quantizing 
are two names for a technique used to encode analog signals into digital 
pulses suitable for transmission over binary channels. It is the purpose 
of this paper to determine what kind of performance can be expected from 
well-designed systems of this type when used to encode television signals. 
Systems using both previous sample and previous line feedback are con- 
sidered. 

A procedure is presented for the design of nonadaptive, time invariant 
systems which are near optimum in the sense that the resulting signal to 
unweighted quantizing noise ratios (S/N) are nearly maximum. Simple 
formulas are derived for these S/N ratios which apply to DPCM as well 
as standard PCM. Standard PCM is shown to be a special case of DPCM.
These formulas are verified by digital computer simulation. 

Any advantage of DPCM stems from removing the redundancy from 
the signal to be transmitted. Redundancy in a signal, however, affords a 
certain protection against noise introduced in the transmission medium. 
The penalty for removing this redundancy, through DPCM or other means, 
is that the transmitted signal becomes more fragile and requires a higher- 
quality transmission medium than would otherwise be required. This pen- 
alty is discussed in quantitative terms. 


I. INTRODUCTION


In this paper, the terms predictive quantizing and differential pulse 
code modulation (DPCM) will be used interchangeably. They describe 
a special kind of predictive communications system. A predictive com- 
munications system is one in which the difference between the actual 
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signal and an estimate of the signal, based on its past, is transmitted. 
Both the transmitter and the receiver make an estimate or prediction 
of the signal’s value based on the previously transmitted signal. The 
transmitter subtracts this prediction from the true value of the signal 
and transmits this difference. The receiver adds this prediction to the 
received difference signal yielding the true signal. Highly redundant 
signals, such as television, are well suited for predictive transmission 
systems because of the accuracy possible in the prediction. If the signal 
is sampled, and if the difference signal is quantized and encoded into 
PCM, then the system is a predictive quantizing or DPCM system. 
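The transmit-the-difference idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a previous-sample predictor with a hypothetical gain `a1 = 0.9`, and a uniform quantizer with a hypothetical step size, which is not the quantizer designed later in the paper. The encoder forms its prediction from the reconstructed value, so that transmitter and receiver always make the same prediction.

```python
def dpcm_encode(samples, a1=0.9, step=0.1):
    """Return integer quantizer indices for the prediction differences.

    a1 and step are illustrative values, not taken from the paper.
    """
    codes = []
    recon_prev = 0.0                      # receiver's last reconstructed sample
    for s in samples:
        prediction = a1 * recon_prev      # estimate based on the past
        index = round((s - prediction) / step)   # uniform quantizer (assumption)
        codes.append(index)
        recon_prev = prediction + index * step   # what the receiver will rebuild
    return codes

def dpcm_decode(codes, a1=0.9, step=0.1):
    """Rebuild the signal by adding each received difference to the prediction."""
    output = []
    recon_prev = 0.0
    for index in codes:
        recon_prev = a1 * recon_prev + index * step
        output.append(recon_prev)
    return output

if __name__ == "__main__":
    signal = [0.0, 0.31, 0.52, 0.48, 0.90, 0.87, 0.20]
    rebuilt = dpcm_decode(dpcm_encode(signal))
    for s, r in zip(signal, rebuilt):
        print(f"{s:6.3f} -> {r:6.3f}")
```

With quantization in the loop the receiver recovers the signal plus quantizing noise rather than the signal itself; in this sketch the error on each sample is bounded by half the quantizer step.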

A block diagram of systems of this type is shown in Fig. 1. Although
delta modulation, which uses the feedback principle, was introduced
somewhat earlier,¹ DPCM systems are based primarily on an invention
by Cutler.² In his original patent in 1952, Cutler used one or more
integrators to perform the prediction function. His invention is based 
on transmitting the quantized difference between successive sample 
values rather than the sample values themselves. The invention is a 
special case of a predictive quantizing system and it turned out to be 
a special case admirably matched to the statistics of television signals. 

In the early nineteen forties Wiener³ developed the theory of optimum
linear prediction. By 1952 Oliver, Kretzmer, and Harrison at the Bell
Telephone Laboratories realized the importance of linear prediction in
feedback communications systems and proposed that it be used to reduce
the redundancy, and, therefore, lower the required power in highly
periodic signals such as television. Oliver⁴ explained how linear
prediction could be used to reduce the bandwidth required to transmit
redundant signals. Realizing that knowledge of the statistical properties
of television signals was necessary in the design of linear prediction
systems, Kretzmer⁵ determined some statistics of typical television picture
material. Harrison⁶ actually built a signal processing system for television
signals and illustrated how redundancy could be removed from these
signals using linear prediction. Concurrently with this work at the Bell
Telephone Laboratories, but published somewhat later, Elias⁷ at MIT
was developing his theory of predictive coding, which explained the use
of linear prediction in PCM systems.

Fig. 1 — Block diagram of a DPCM system.

Graham⁸ recognized that the theory of prediction could be incorporated
into the system described by Cutler. Since Graham’s work in
1958, much effort has been expended to devise and build such a system 
for the transmission of television signals. Although a few experimental 
systems have been constructed, it is a discouraging fact that such sys- 
tems have never proved to be very useful for high quality television 
transmission. Television signals are still being transmitted over trans- 
mission systems which do not take advantage of the signals’ inherent 
redundancy. 

It is the purpose of this paper to determine the advantages and dis- 
advantages of well-designed DPCM systems. Such information is 
needed in order to establish whether or not DPCM systems are poten- 
tially useful for the transmission of television signals. To do this we 
present a procedure for the design of some DPCM systems which are 
near optimum for three television scenes and determine what kind of 
performance can be expected from such systems. The results obtained 
are verified by simulating some DPCM systems on an IBM 7094 digital 
computer and using, as an input, television signals derived from a 
flying spot scanner. Our study is restricted to nonadaptive systems 
using linear prediction in the feedback loop and a quantizer whose 
characteristics do not depend on the instantaneous value of the input 
signal. Both previous-sample and previous-line feedback in the predic- 
tion operation are considered in detail. 


II. SUMMARY OF RESULTS AND CONCLUSIONS


Some of the more important results and conclusions about DPCM 
systems designed for the transmission of television signals are enumer- 
ated below. Throughout this paper the term S/N refers to the ratio of 
signal to quantizing noise. 

(i) A simple formula for the S/N ratio is derived. If the sync pulses
need not be transmitted, then standard PCM is shown to be a special
case of DPCM and its S/N ratio is also given by this formula.

(ii) When the horizontal resolution is equal to the vertical resolution,
line feedback, when used in addition to previous-sample feedback, can
give no more than a 1.9-db additional improvement in S/N ratio. For
FCC standard monochrome entertainment television the improvement
due to line feedback will be considerably less than 1.9 db.

(iii) Differential PCM provides more of an advantage for high
resolution television systems than for low resolution systems. For
monochrome entertainment television, previous-sample feedback DPCM
transmission systems can provide a signal-to-quantizing noise ratio 
approximately 15 db higher than standard PCM. This improvement 
may easily vary as much as 2 or 3 db depending on picture material. A 
2.8-db improvement in S/N ratio can be realized in standard PCM 
systems if the sync pulses can be reconstructed by the decoder and need 
not be transmitted. The improvement of previous-sample feedback 
DPCM over sync-less PCM is, therefore, only about 12 db for mono- 
chrome entertainment television. The effect of line feedback has not 
been included in the above numbers. 

(iv) Since 6 db of quantizing noise is equivalent to one bit per sample,
the advantage in DPCM can also be expressed in terms of bit rate. For 
a constant signal-to-quantizing noise ratio, a DPCM system designed 
for entertainment television can provide a saving of about 18 megabits 
(2 bits per sample) over standard PCM. This assumes a sampling rate 
of 9 megacycles, which is twice the bandwidth, and that the noise added 
by bit errors in the transmission medium is negligible. These bit rate 
reductions are nearly independent of the signal-to-quantizing noise 
ratios required. 

(v) A signal encoded into DPCM is more vulnerable to noise in the 
transmission medium (bit errors) than one encoded into PCM. It is 
characteristic of DPCM systems that, if they decrease the quantizing 
noise by k db over standard PCM, then the noise in the decoded signal 
caused by errors in the digital transmission channel is increased by k db. 
This penalty means that, if DPCM is used to reduce the quantizing
noise by k db, then the error rate in the digital channel required for
satisfactory transmission is reduced by a factor of (1.26)^k. This does
not imply that DPCM offers no advantage. If the limiting degradation
is quantizing noise, and this is generally true for digital systems, then
decreasing this quantizing noise, even at the expense of increasing the
noise introduced in the transmission medium, is desirable. Digital
transmission lines designed for PCM encoding, however, may be un- 
satisfactory for DPCM encoding. This result applies to DPCM systems 
designed for any type of signal. 

(vi) The power spectrum of the quantizing noise is approximately
flat. The amplitude density function of the quantizing noise is found to
be somewhat flatter than a Gaussian function.

(vii) For television input signals the amplitude density function of
the quantizer input in a well-designed DPCM system is approximately
Laplacian.
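The arithmetic behind conclusions (iv) and (v) can be made explicit with a short Python sketch. The function names are illustrative only. The sketch uses the round figure of 6 db per bit and the 9-megacycle sampling rate quoted above, and it reads the factor 1.26 as the power ratio corresponding to 1 db.

```python
SAMPLING_RATE_HZ = 9.0e6   # 9 megacycles, twice the bandwidth, as quoted in (iv)
DB_PER_BIT = 6.0           # round figure used in (iv)

def bits_saved_per_sample(snr_gain_db):
    """Bits per sample that an S/N advantage of snr_gain_db is worth."""
    return snr_gain_db / DB_PER_BIT

def bit_rate_saving(snr_gain_db, sampling_rate_hz=SAMPLING_RATE_HZ):
    """Bit-rate saving in bits per second at constant signal-to-quantizing noise."""
    return bits_saved_per_sample(snr_gain_db) * sampling_rate_hz

def error_rate_factor(k_db):
    """Factor by which the tolerable channel error rate shrinks for k db,
    taking 1 db as a power ratio of 10**0.1 (about 1.26)."""
    return 10.0 ** (k_db / 10.0)

if __name__ == "__main__":
    gain_db = 12.0   # DPCM advantage over sync-less PCM quoted in (iii)
    print(bits_saved_per_sample(gain_db), "bits per sample")
    print(bit_rate_saving(gain_db) / 1.0e6, "megabits per second")
    print(error_rate_factor(gain_db), "times lower tolerable error rate")
```

For the 12-db figure this gives 2 bits per sample, or 18 megabits per second, at the price of a channel error rate roughly sixteen times lower.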

Television picture material which has meaning to a human observer 
has certain patterns which cause statistical redundancy in the resulting 
television signals. Differential PCM takes advantage of this statistical 
redundancy and the performance of DPCM systems varies with this 
redundancy. Conclusions (iii) and (iv) above are based on measured
statistics of television signals derived from three scenes which have de- 
tail typical of television picture material. 


III. PERFORMANCE CRITERION 


The performance criterion used is the ordinary signal-to-quantizing 
noise ratio, S/N, present in the video part of the composite signal. 
Noise present in the sync pulses is seldom a limiting factor in television
transmission. While it has often been argued that the S/N ratio is not 
an adequate performance criterion for television systems, a better 
alternative for analytical study has never been proposed. Furthermore, 
when used with discretion, the S/N ratio is a useful measure in deter- 
mining the performance of television systems. It is especially useful in 
helping to decide which kinds of systems should be built and evaluated 
subjectively. The subjective test is the final arbiter in determining the 
usefulness of DPCM for the transmission of television signals. 

Unless otherwise stated, the term noise used in this paper implies 
quantizing noise. We are concerned here with designing DPCM encoding 
and decoding systems which minimize the mean square difference be- 
tween the decoded output signal and the analog input signal. This 
optimization is based on an analytical, i.e., objective, criterion, not a
subjective one. Thus, the S/N ratios used are unweighted. All sampling
is assumed to be at twice the bandwidth of the baseband input signal,
and all the resulting quantizing noise is considered to be in-band.
Systems have been proposed⁹ which shape the power spectrum of the
quantizing noise to make it less objectionable to the human observer.
This approach, however, is complicated by the difficulty in determining 
the proper weighting function for noise which is not independent of 
the signal. In most DPCM systems the quantizing noise is highly corre- 
lated with the derivative of the signal. 
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IV. DESIGN PROCEDURE 


The design procedure used herein is to first design the predictor 
ignoring the presence of the quantizer. Then the quantizer is designed 
to match the amplitude distribution of the signal coming from the 
subtractor. This procedure will result in a system which is very nearly 
optimum because when the number of quantizing levels is large, the 
inclusion of the quantizer in the circuit has very little effect on the 
amplitude distribution of the signal coming out of the subtractor. The 
predictor will be restricted to be a linear time invariant device and the 
theory of linear prediction will be used to optimize it. The quantizer 
will be designed in accordance with procedures first proposed by Panter 
and Dite.! 


V. THE PREDICTOR


It is true that nonlinear prediction is superior, by the S/N ratio 
criterion, to linear prediction for television signals. It has never been 
determined, however, just how much the S/N ratio can be improved 
by using nonlinear prediction techniques. Graham⁸ suggested one
nonlinear predictor and simulated it on the computer. Fine! discusses the
general case where both nonlinear prediction and quantization are 
allowed. In this paper, however, only linear prediction is used. 


5.1 Theory of Linear Prediction 


The following brief explanation of the procedure of linear prediction 
is based on the terse exposition of this subject given by Papoulis.” 

Let a stationary signal S(t) with mean 0 and rms value σ be sampled at
the times t_1, t_2, ..., t_n, ... and let the sample values be S_1, S_2, ...,
S_n, ..., respectively.

A linear estimate of the next sample value S_0 based on the previous
n sample values S_1, S_2, ..., S_n is defined to be

    \hat{S}_0 = a_1S_1 + a_2S_2 + \cdots + a_nS_n.    (1)

For simplicity, we assume here that the a’s and S’s are real numbers.
A linear predictive encoder forms this estimate Ŝ_0 and transmits the
difference or error

    e_0 = S_0 - \hat{S}_0.    (2)

A block diagram of such a system is shown in Fig. 2. The D’s represent
delay elements.
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Fig. 2 — Block diagram of a linear predictive encoder. 


We define the best estimate of S_0 to be that value of Ŝ_0 for which
the expected value of the squared error is minimum. To find the values
of the a’s which satisfy this condition we first take the partial
derivatives of E[(S_0 - Ŝ_0)^2] with respect to each one of the a’s. E[x] denotes
the expected value of x.

    \frac{\partial E[(S_0 - \hat{S}_0)^2]}{\partial a_i} = \frac{\partial E[(S_0 - (a_1S_1 + a_2S_2 + \cdots + a_nS_n))^2]}{\partial a_i}
        = -2E[(S_0 - (a_1S_1 + a_2S_2 + \cdots + a_nS_n))S_i],    i = 1, 2, \ldots, n.

To find an extremum, in this case a minimum, we set this equal to
zero giving

    E[(S_0 - (a_1S_1 + a_2S_2 + \cdots + a_nS_n))S_i] = E[(S_0 - \hat{S}_0)S_i] = 0,    i = 1, 2, \ldots, n.    (3)

If we represent the covariance of S_i and S_j by

    R_{ij} = E[S_iS_j],    (4)

then from (3) we can rewrite the conditions for the best linear mean
square estimate as

    R_{0i} = a_1R_{1i} + a_2R_{2i} + \cdots + a_nR_{ni},    i = 1, 2, \ldots, n.    (5)
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Equation (5) defines a set of n simultaneous linear equations in the n 
unknowns a_i, i = 1, 2, ..., n, which can be found if the covariances
R_ij are known. These covariances are found from the autocovariance
ψ(τ) of the signal itself,

    R_{ij} = \psi(t_i - t_j).    (6)

If Ŝ_0 is the best linear mean square estimate of S_0, then the expected
value of the square of the error signal e_0 is

    \sigma_e^2 = E[(S_0 - \hat{S}_0)^2] = E[(S_0 - \hat{S}_0)S_0]
    \sigma_e^2 = R_{00} - (a_1R_{01} + a_2R_{02} + \cdots + a_nR_{0n}).    (7)

In (7), R_00 is simply the variance σ^2 of the original sequence S_0, S_1,
... = {S_i}.
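The passage from the first to the second form of the error variance rests on the orthogonality condition (3). A brief sketch of that step, written in the notation of (1) through (5), is given here as a reading aid rather than as part of the original derivation:

```latex
\begin{aligned}
\sigma_e^2 &= E[(S_0-\hat{S}_0)^2]
            = E[(S_0-\hat{S}_0)S_0] - \sum_{i=1}^{n} a_i\,E[(S_0-\hat{S}_0)S_i] \\
           &= E[(S_0-\hat{S}_0)S_0]
            \qquad\text{(each term of the sum vanishes by (3))} \\
           &= R_{00} - \sum_{i=1}^{n} a_i R_{0i}.
\end{aligned}
```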

The sequence of transmitted error samples is e_0, e_1, ... = {e_i} where

    e_i = S_i - \hat{S}_i,    i = 0, 1, \ldots,    (8)

and

    \hat{S}_i = a_1S_{i+1} + a_2S_{i+2} + \cdots + a_nS_{i+n}.

The error sequence {e_i} is less correlated and has smaller variance than
the signal sequence {S_i}. The use of linear prediction has produced a
sequence {e_i} from which the sequence {S_i} can be reconstructed. The
variance σ_e^2 of the error sequence {e_i} is less than the variance of the
original sequence {S_i} by the amount shown in the parentheses in (7).
If the number of samples n used in forming the estimate is unlimited,
then the sequence of error samples can always be made completely
uncorrelated. If the sequence of samples S_0, S_1, ... = {S_i} is an rth
order Markoff sequence, then only r samples need be used in forming
the best estimate of S_0 and the resulting sequence of error samples will
be uncorrelated.

As an example of particular relevance to television, consider the 1st
order Markoff sequence formed by sampling a signal whose
autocorrelation is the exponential function e^{-α|τ|}. In this case, even if all previous
sample values are available, the estimate of S_0 which minimizes σ_e^2 is
Ŝ_0 = (R_01/σ^2)S_1 where S_1 is the most recent sample value available.
It is easy to show that, in this case, the error sequence {e_i} is completely
uncorrelated, i.e.,

    E[e_ie_j] = 0,    i \neq j.




The autocorrelation function of one line of a television signal is very
similar to e^{-α|τ|} so in this case we expect that basing our estimate only
on the previous sample value will be almost as good as using many 
sample values on the same line. It will be shown, however, that, if we 
have access to sample values on the adjacent line and/or on the previous 
frame, we can improve our prediction. 
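The claim that previous-sample prediction whitens a first-order Markoff sequence can be checked numerically. The Python sketch below assumes a unit-variance sequence whose covariance at a lag of k samples is rho**|k| (the sampled form of an exponential autocorrelation), and evaluates the covariance of the error e_i = S_i - rho * S_{i+1} at several lags. The name `rho` and the value 0.9 are illustrative.

```python
def error_covariance(rho, lag):
    """E[e_i e_(i+lag)] for e_i = S_i - rho * S_(i+1), with R(k) = rho**|k|.

    Expanding the product gives
        (1 + rho**2) * R(lag) - rho * (R(lag - 1) + R(lag + 1)).
    """
    def R(k):
        return rho ** abs(k)
    return (1.0 + rho ** 2) * R(lag) - rho * (R(lag - 1) + R(lag + 1))

if __name__ == "__main__":
    rho = 0.9   # illustrative adjacent-sample covariance
    for lag in range(0, 5):
        print(lag, error_covariance(rho, lag))
```

At lag 0 the result is the error variance 1 - rho**2; at every other lag the terms cancel, up to floating-point rounding.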


5.2 Application to Television Signals 


The samples S_1, S_2, ..., S_n used in (1) to form the estimate Ŝ_0
need not be the most recently transmitted ones and they need not be
in any particular order. They are simply n sample values which have
been transmitted in the past. Fig. 3 illustrates 7 sample values which
can be used to form a reasonably good estimate of S_0. Such an estimate
would be

    \hat{S}_0 = a_1S_1 + a_2S_2 + a_3S_3 + a_4S_4 + a_5S_5 + a_6S_6 + a_7S_7,    (9)

where the a’s are chosen to satisfy (5). Also shown in the figure are
covariances between these samples.

Previous line:   S_6      S_3      S_2      S_4      S_7
                 R_06     R_03     R_02     R_04     R_07
    SCENE A      0.629    0.756    0.868    0.758    0.618
    SCENE B      0.762    0.813    0.901    0.796    0.763
    SCENE C      0.814    0.905    0.960    0.919    0.829

Present line:    S_5      S_1      S_0
                 R_05     R_01     R_00
    SCENE A      0.631    0.803    1.0
    SCENE B      0.774    0.816    1.0
    SCENE C      0.832    0.934    1.0

Fig. 3 — Television scan showing sample values near S_0. Covariances for the
three scenes in Fig. 6 are also shown.

It will be shown that there is little advantage in using samples S_3
through S_7 for, once S_1 and S_2 are used in the prediction, the other five
samples contain little additional information about S_0. Most DPCM
systems built in the past use only the previous sample S_1 and form the
estimate

    \hat{S}_0 = a_1S_1.


In this simple case, it is clear from (5) that the constant a_1 should be
R_01/σ^2, the covariance between adjacent sample points divided by the
mean square value of the input sequence. DPCM systems of this type 
are called previous-sample feedback systems, and a block diagram of 
the predictor used in such a system is shown in Fig. 4. 
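For previous-sample feedback the gain and the resulting error follow in closed form from (5) and (7) with n = 1. The Python sketch below evaluates them from the adjacent-sample covariances quoted for the three scenes in Fig. 3, with the signals normalized to unit variance. The function name is illustrative, and because the quoted covariances are rounded to three places the results agree with Table I only to within about 0.001.

```python
import math

# Adjacent-sample covariances R_01 quoted in Fig. 3 (unit-variance signals).
R01 = {"A": 0.803, "B": 0.816, "C": 0.934}

def previous_sample_predictor(r01, variance=1.0):
    """Return (a_1, sigma_e, reduction_db) for the estimate a_1 * S_1."""
    a1 = r01 / variance                    # from (5) with n = 1
    var_e = variance - a1 * r01            # from (7) with n = 1
    sigma_e = math.sqrt(var_e)
    reduction_db = -20.0 * math.log10(sigma_e / math.sqrt(variance))
    return a1, sigma_e, reduction_db

if __name__ == "__main__":
    for scene, r01 in R01.items():
        a1, sigma_e, db = previous_sample_predictor(r01)
        print(f"scene {scene}: a_1 = {a1:.3f}, sigma_e = {sigma_e:.3f}, {db:.1f} db")
```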

A DPCM system which forms its estimate of S_0 by using the previous
sample S_1 and the adjacent sample on the previous line S_2 will be called
a line-and-sample feedback system. In this case,

    \hat{S}_0 = a_1S_1 + a_2S_2.    (11)


A block diagram of the predictor for this system is shown in Fig. 5. 

This concept can easily be extended to take advantage of frame-to- 
frame correlation. A frame-line-and-sample feedback system would 
form its estimate of the next sample value by 

    \hat{S}_0 = a_1S_1 + a_2S_2 + a_fS_f    (12)

where S_f is the sample value which is equivalent to S_0 but on the
previous frame. Frame feedback systems are not considered in detail in this
paper primarily because statistics of frame-to-frame correlations are 
not available. 


5.3 Statistics of Television Signals 


In order to determine some statistics of television signals and to use
television signals as inputs to DPCM systems simulated on the IBM
7094 digital computer, some television signals were obtained from a
slow-speed flying-spot scanner.® These signals were sampled and
encoded into 11-bit PCM and placed on a magnetic tape suitable as an
input to the computer. The signals were obtained by scanning the three
square slides shown in Fig. 6 and represent only one frame of a television
signal. In conformity with television practices, the video signal was a
function of the 0.4 power of the brightness of the original scene. The
standards used gave 100 lines and 100 samples per line for the visible
part of the pictures and all samples taken were on a symmetric lattice
or grid.

Fig. 4 — Previous-sample predictor.

Fig. 5 — Previous line-and-sample predictor.

The signals on magnetic tape were composite signals, i.e., they
consisted of the video signal and a train of sync pulses. Noise and distortions
in the sync pulses do not govern the quality of a television signal as
long as synchronization is maintained. Therefore, DPCM systems
should be matched to the statistics of the video part alone. For this
reason, the horizontal sync pulses were ignored and the autocovariance
functions of the video part of the signals were obtained. For convenience,
the signals were first normalized so that the rms value σ of the video
was 1 and its mean value was 0. The autocovariance functions ψ(τ)
are shown in Fig. 7. For small values of τ, these functions are very
similar to exponential functions. Since we are dealing with sample values
rather than with continuous signals, the autocovariance is actually a
set of points at integer values of the lag τ and these points represent
the values of R_0i, i = 0, 1, .... Fig. 7 was constructed by finding these
points and drawing lines between them. This is also true of Figs. 8 and
16. The peaks at τ = 100 are due, of course, to the high correlation
between adjacent lines of the television signals. Correlations between
adjacent frames are not illustrated in the figure because the signals
used represented only one frame of a television signal.

Fig. 3 illustrates some of the covariances between neighboring points
for the three scenes. For example, the covariance between points S_0
and S_4 in scene B is R_04 = 0.796. The three pictures used had higher
vertical than horizontal correlation.

Fig. 6 — Pictures of three slides scanned to obtain television signals.

A transmission system is useful only if it can satisfactorily transmit 
a vast ensemble of signals and its performance must be judged on the 
basis of its ability to transmit almost all members of this ensemble. 
The statistics we use here have been obtained from only three members 
of this ensemble and, since the members of this ensemble are derived 
from a nonergodic process, we cannot obtain the statistics of the ensem- 
ble by examination of these three members. Nevertheless, it is useful 
to determine the design and performance of DPCM systems when used 
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to transmit these members which, in some sense at least, are representa- 
tive of the whole ensemble. 

The autocovariance functions in Fig. 7 are averages over the time 
for each of the three signals used. The autocovariance function of the 
random process from which these three signals could be derived could 
not be determined here. Franks, however, has proposed a model for 
this random process in which the autocovariance function of the picture 
material is exponential in both the horizontal and vertical directions. 
Data obtained in this study, some of which is illustrated in Fig. 7, 
indicates that this is a good approximation for the three scenes used here. 


5.4 Linear Predictors 


Using the data in Fig. 3, we can solve (5) and (7) for the a_i and σ_e
for several practical linear predictors. Table I illustrates the optimum
values of the a’s and the resulting mean square error signals for 8
different predictors. The relative positions of the sample values in this table
are those of Fig. 3. For example, if the prediction of S_0 is based on the
three sample values S_1, S_2, and S_4 (predictor number 6) then the linear
predictor giving the smallest value of σ_e for scene C forms Ŝ_0 by the
equation

    \hat{S}_0 = 0.383 S_1 + 0.362 S_2 + 0.263 S_4.

When no quantization is present, such a system results in an error signal
whose rms value σ_e is 0.230. This is 12.8 db below the rms value of the
signal itself whose rms value σ is 1. The transmitted error sequence will
be much less correlated than the original signal sequence. We may think
of this signal processing as the removal of redundancy, in this case,
12.8 db of redundancy.

Fig. 7 — Autocovariances of the 100 line, 100 samples per line video signals
obtained from the three scenes of Fig. 6. (The sync pulses were not included in
computing these functions.)
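The solution of (5) and (7) for a given set of samples can be sketched in Python as follows. The elimination routine and function names are illustrative, not the program used in the study. The covariances between the predicting samples themselves are not listed separately in the text; the sketch assumes that, by stationarity, they equal the Fig. 3 covariance for the same relative displacement (for example, that R_12 equals R_04, and that R_24 equals R_01).

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def optimum_predictor(r_between, r_with_s0, r_00=1.0):
    """Gains a_i from (5) and rms error sigma_e from (7).

    r_between[i][j] is the covariance between the i-th and j-th samples used;
    r_with_s0[i] is the covariance between the i-th sample used and S_0.
    """
    gains = solve_linear(r_between, r_with_s0)
    var_e = r_00 - sum(a * r for a, r in zip(gains, r_with_s0))
    return gains, math.sqrt(var_e)

# Scene A, samples S_1 and S_2; R_12 taken equal to R_04 = 0.758 (assumption).
gains_a, sigma_a = optimum_predictor([[1.0, 0.758], [0.758, 1.0]], [0.803, 0.868])

# Scene C, samples S_1, S_2, S_4; R_12 = 0.919, R_14 = 0.829, R_24 = 0.934,
# each read from Fig. 3 at the matching displacement (assumption).
gains_c, sigma_c = optimum_predictor(
    [[1.0, 0.919, 0.829], [0.919, 1.0, 0.934], [0.829, 0.934, 1.0]],
    [0.934, 0.960, 0.919],
)

if __name__ == "__main__":
    print([round(a, 3) for a in gains_a], round(sigma_a, 3))
    print([round(a, 3) for a in gains_c], round(sigma_c, 3))
```

Since the Fig. 3 covariances are rounded to three places, the computed gains should be expected to match the tabulated ones only to about the third decimal place.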

Examination of Table I reveals that once samples S_1 and S_2 have
been used in the prediction, there is little advantage in using any others.
This means that samples S_1 and S_2 provide almost all of the
information about sample S_0 which can be obtained from the previous samples.
Samples 8; and S, contain almost no additional information. For a 
system with line feedback (a system which can store and, therefore, has 
access to the previous line), there is little point in using any samples 
but S_1 and S_2. Similarly, for a system without line feedback there is
little point in using any sample but S_1 to predict S_0. Furthermore, for
the pictures tested, line feedback itself provides only about a 3-db
improvement in the estimate of S_0. This is somewhat disappointing
especially in view of the fact that the scenes tested had higher vertical 
than horizontal correlation. Exactly what can be obtained from frame 
feedback must await the availability of frame-to-frame covariance 
statistics. Although the above conclusions about line feedback are based 
on statistics obtained from some 100 line and 100 samples per line pic- 
tures it will be shown in section VIII that they apply to television sys- 
tems in general. 

This study suggests that the sequence of sample values derived from
one frame of a television signal (or a facsimile signal) may be approximated
well by a second-order Markoff sequence.* Furthermore, studies
by Deriugin indicate that the sequence derived from many frames of
a typical television signal may be, approximately, a third-order Markoff
sequence.* In this case, the state (value) of the next sample $S_0$ may be
statistically dependent only on $S_1$, $S_2$, and $S_F$, where these are the
sample values adjacent to $S_0$ on the same line, the previous line, and
the previous frame, respectively. More work is required to determine
just how well Markoff sequences can represent sample values of
television signals.

* This might more properly be called a distant second (or third) order Markoff
sequence because, although $S_1$ is the previous sample value, there are many
intervening samples between $S_F$, $S_2$, and $S_0$.
5.5 Computer Simulation of Predictors


In order to determine how effectively redundancy could be removed
from a television signal by using prediction, the predictors number
1, 3, and 8 shown in Table I were simulated on the computer for all
three scenes. The actual rms values $\sigma_e$ of the prediction errors
agreed well with those shown in Table I. The autocovariances of the
error signals were found, and for scene C they are illustrated in Fig. 8.
The autocovariance functions shown in Fig. 8 are also representative
of what was found for scenes A and B.

Figs. 9 and 10 show the amplitude distribution of the error sequence 


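TABLE I — VALUES OF THE AMPLIFIER GAINS AND RMS PREDICTION
ERROR FOR 8 PREDICTORS MATCHED TO THE 3 PICTURES OF FIG. 6*

Predictor | Samples used in         | Scene | $\sigma_e$ | $-20 \log \sigma_e$ | $a_1$ | $a_2$ | $a_3$  | $a_4$ | $a_5$
number    | prediction              |       |            | (db)                |       |       |        |       |
----------+-------------------------+-------+------------+---------------------+-------+-------+--------+-------+--------
1         | S_1 (see Fig. 4)        | A     | 0.597      |  4.5                | 0.803 |       |        |       |
          |                         | B     | 0.578      |  4.8                | 0.816 |       |        |       |
          |                         | C     | 0.358      |  8.9                | 0.934 |       |        |       |
2         | S_2                     | A     | 0.498      |  6.1                |       | 0.868 |        |       |
          |                         | B     | 0.434      |  7.2                |       | 0.901 |        |       |
          |                         | C     | 0.279      | 11.1                |       | 0.960 |        |       |
3         | S_1, S_2 (see Fig. 5)   | A     | 0.444      |  7.0                | 0.341 | 0.610 |        |       |
          |                         | B     | 0.402      |  7.9                | 0.270 | 0.686 |        |       |
          |                         | C     | 0.247      | 12.1                | 0.333 | 0.654 |        |       |
4         | S_1, S_5                | A     | 0.595      |  4.5                | 0.834 |       |        |       | -0.039
          |                         | B     | 0.547      |  5.2                | 0.552 |       |        |       |  0.324
          |                         | C     | 0.339      |  9.4                | 1.229 |       |        |       | -0.316
5         | S_1, S_4                | A     | 0.494      |  6.1                | 0.541 |       |        | 0.423 |
          |                         | B     | 0.512      |  5.8                | 0.499 |       |        | 0.415 |
          |                         | C     | 0.246      | 12.2                | 0.550 |       |        | 0.463 |
6         | S_1, S_2, S_4           | A     | 0.443      |  7.1                | 0.337 | 0.481 |        | 0.163 |
          |                         | B     | 0.398      |  8.0                | 0.238 | 0.629 |        | 0.101 |
          |                         | C     | 0.230      | 12.8                | 0.383 | 0.362 |        | 0.263 |
7         | S_1, S_2, S_3           | A     | 0.439      |  7.2                | 0.432 | 0.660 | -0.149 |       |
          |                         | B     | 0.401      |  7.9                | 0.227 | 0.670 |  0.062 |       |
          |                         | C     | 0.224      | 13.0                | 0.606 | 0.793 | -0.417 |       |
8         | S_1, S_2, S_3, S_4      | A     | 0.429      |  7.3                | 0.419 | 0.533 | -0.134 | 0.155 |
          |                         | B     | 0.398      |  8.0                | 0.210 | 0.620 |  0.047 | 0.097 |
          |                         | C     | 0.214      | 13.4                | 0.598 | 0.544 | -0.346 | 0.203 |

* The rms value $\sigma$ of the input signal is 1. This table is concerned only with
prediction error and does not consider the effects of quantization.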

Fig. 8 — Autocovariance of error sequence $\{e_i\}$ of scene C for three linear
predictors matched to this scene. (The samples $S_1$, $S_2$, $S_3$, and $S_4$ are defined in
Fig. 3.)


for the three pictures using previous-sample prediction (predictor number 1
in Table I), and line-and-sample prediction (predictor number 3
in Table I), respectively. The shape of these density functions is of
foremost importance in designing an optimum quantizer. In both figures
the density functions can be approximated reasonably well by Laplacian
functions. These amplitude density functions were found by
dividing the range $\pm 4\sigma$ into 25 equal intervals and finding the number
of sample values in each interval. The points so found were normalized
and curves drawn between them.
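
The binning procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the error sequence below is synthetic (drawn from a Laplacian distribution with a fixed seed), not the picture data of this study, and the function names are hypothetical.

```python
import math
import random

def laplacian_density(e, sigma_e):
    """Laplacian density with rms value sigma_e."""
    return math.exp(-math.sqrt(2) * abs(e) / sigma_e) / (math.sqrt(2) * sigma_e)

def amplitude_density(samples, sigma, n_bins=25, span=4.0):
    """Histogram estimate of the density over (-span*sigma, +span*sigma)."""
    lo, hi = -span * sigma, span * sigma
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            # min() guards against a rounding result equal to n_bins.
            counts[min(int((x - lo) / width), n_bins - 1)] += 1
    centers = [lo + (k + 0.5) * width for k in range(n_bins)]
    # Normalize so that the histogram integrates to the in-range fraction.
    density = [c / (len(samples) * width) for c in counts]
    return centers, density

# Synthetic stand-in for an error sequence (an assumption, not measured data).
rng = random.Random(0)
sigma_e = 1.0
scale = sigma_e / math.sqrt(2)  # Laplacian scale giving rms value sigma_e
errors = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0 / scale)
          for _ in range(50000)]
centers, density = amplitude_density(errors, sigma_e)
```

Plotting `density` against `centers` next to `laplacian_density` evaluated at the same points gives the kind of comparison made in Figs. 9 and 10.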


VI. THE QUANTIZER 


In analog systems, it is difficult to evaluate the wisdom in reducing 
the power by removing the redundancy from a signal. For this process 

Fig. 9 — Probability density for amplitude of error sequence $\{e_i\}$ for scenes
A, B, and C using previous-sample prediction, $\hat{S}_0 = a_1 S_1$. (The value of $a_1$ is
chosen to match each scene: $a_1 = 0.803$ for scene A, 0.816 for scene B, and
0.934 for scene C. The abscissa is amplitude in standard deviations.)


automatically makes a signal more susceptible to noise in the transmission
medium. While this is still true in digital systems, as will be
shown later, we are assured both by logic and by experience that
the errors in properly designed digital systems can be made small enough
to be neglected. And, if we can ignore this transmission noise, i.e.,
assume that the probability of error in a digital system can be made as
small as we like, there is a dividend in reducing the rms value of the
signal to be transmitted. In fact, it will be shown that (for the signals
used here, at least) reducing the rms value of the transmitted signal
from $\sigma$ to $\sigma_e$ decreases the rms value of the quantizing noise by a factor
$\sigma/\sigma_e$.

If the input to the quantizer in Fig. 1 is $e_0$, then its output is $e_0 + q_0$,
where $q_0$ is the quantizing noise. Since the receiver forms the decoded
output by adding $e_0 + q_0$ to the estimate $\hat{S}_0$, the quantizing noise in
the decoded signal is also $q_0$. Minimizing the quantizing noise in the
decoded output, therefore, is equivalent to minimizing the rms value
of the quantizing noise coming out of the quantizer. This method of
minimizing the quantizing noise was recognized independently by
Nitadori.
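
The statement that the decoded signal carries exactly the quantizer's own noise can be checked with a short simulation. The sketch below is illustrative only: it assumes a previous-sample predictor placed in a feedback loop around the quantizer, uses a uniform quantizer for simplicity (not the nonuniform quantizer designed later in this section), and all names and sample values are hypothetical.

```python
def dpcm_encode_decode(samples, a1, step):
    """Previous-sample DPCM; returns the decoded sequence and the quantizer noise."""
    pred_enc = 0.0  # encoder's estimate of the current sample
    pred_dec = 0.0  # receiver's estimate of the current sample
    decoded, q_noise = [], []
    for s in samples:
        e = s - pred_enc                 # prediction error fed to the quantizer
        e_q = step * round(e / step)     # quantizer output, error plus noise
        q_noise.append(e_q - e)
        # Encoder and receiver both rebuild the sample from the quantized error,
        # so their predictors stay in step.
        pred_enc = a1 * (pred_enc + e_q)
        recon = pred_dec + e_q
        pred_dec = a1 * recon
        decoded.append(recon)
    return decoded, q_noise

samples = [0.0, 0.4, 0.9, 0.7, -0.2, -0.8, 0.1]
decoded, q_noise = dpcm_encode_decode(samples, a1=0.8, step=0.25)
# Error in the decoded output, sample by sample.
output_error = [d - s for d, s in zip(decoded, samples)]
```

In this sketch `output_error` agrees with `q_noise` to within floating-point rounding, which is the property used above to reduce the design problem to minimizing the quantizer's own noise.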

Fig. 10 — Probability density for amplitude of error sequence $\{e_i\}$ for scenes
A, B, and C using previous line-and-sample prediction, $\hat{S}_0 = a_1 S_1 + a_2 S_2$. (The
values of $a_1$ and $a_2$ are chosen to match each scene: $a_1 = 0.341$, $a_2 = 0.610$
for scene A; $a_1 = 0.270$, $a_2 = 0.686$ for scene B; $a_1 = 0.333$, $a_2 = 0.654$
for scene C.)


In what follows, approximations are made which apply when the
number of quantizing levels $N = 2^n$ is large. Figs. 12 and 13, to be
discussed later, illustrate that the S/N formulas for DPCM systems
are accurate for small N as well. It is possible, nevertheless, that
inaccuracies may occur under certain conditions when N is too small.
If N is less than about 8, the formulas and design procedures presented
here should be used with caution.


6.1 Optimum Quantization 


For n-bit quantization, each member of the error sequence is made to
assume one of $N = 2^n$ different levels. It has long been known that
nonuniform quantization is generally preferable to uniform quantization
in DPCM systems. Panter and Dite have shown that the minimum
mean square quantizing error is given by

$$\sigma_q^2 = \frac{2}{3N^2}\left[\int_0^V P^{1/3}(e)\,de\right]^3, \tag{13}$$

where $P(e)$ is an even function representing the probability density
of the input to the quantizer and $P(e)$ is zero outside the interval
$(-V, V)$ which represents the range of the quantizer input.

The curves for $P(e)$ shown in Figs. 9 and 10 may be approximated
reasonably well by the Laplacian density function

$$P(e) = \frac{1}{\sqrt{2}\,\sigma_e}\exp\left(-\frac{\sqrt{2}\,|e|}{\sigma_e}\right), \tag{14}$$

where $\sigma_e$ is the rms value of the quantizer input. Since the amplitude
density function is different for each scene to be transmitted, the best
we can do is to choose some representative density function and match
the quantizer to it. We choose this function to be the exponential of
(14) and we feel that this will give results which can be expected in
practice. Although Figs. 9 and 10 are plots of the error signal without
a quantizer in the circuit, computer simulations with the quantizer in
the circuit showed that these amplitude density functions are affected
very little by the addition of the quantizer as long as the number of
levels N is greater than 4. Solving the integral in (13) for this $P(e)$
and taking the limit as V gets large gives, as an approximation for the
mean square value of the quantizing noise,


$$\sigma_q^2 \approx \frac{9}{2N^2}\,\sigma_e^2. \tag{15}$$

Refining this approximation by using the actual value of V changes it
very little since, for cases of interest in DPCM, V, which is the peak-to-peak
value of the input signal $S(t)$, is always large compared to $\sigma_e$. For
the three scenes considered here V is about 7 times the rms value $\sigma$ of
the signal $S(t)$, and $\sigma_e$ is generally much less than $\sigma$. For small values
of V the density function in (14) must be truncated. This changes and
complicates the value of $\sigma_q^2$ given in (15).
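
The step from the Panter and Dite integral to the large-V approximation, and the claim that a finite V changes the result very little, can be checked numerically. The sketch below evaluates the integral by a simple midpoint rule for a Laplacian density; the function names and the particular values ($N = 16$, $\sigma_e = 0.3$, $V = 7$) are illustrative assumptions.

```python
import math

def laplacian(e, sigma_e):
    """Laplacian density with rms value sigma_e."""
    return math.exp(-math.sqrt(2) * abs(e) / sigma_e) / (math.sqrt(2) * sigma_e)

def panter_dite_noise(n_levels, sigma_e, v, steps=20000):
    """(2 / 3N^2) * [integral over (0, V) of P^(1/3)]^3 by the midpoint rule."""
    h = v / steps
    integral = h * sum(laplacian((k + 0.5) * h, sigma_e) ** (1.0 / 3.0)
                       for k in range(steps))
    return 2.0 / (3.0 * n_levels ** 2) * integral ** 3

def large_v_noise(n_levels, sigma_e):
    """Limiting value 9 * sigma_e^2 / (2 N^2) as V becomes large."""
    return 9.0 * sigma_e ** 2 / (2.0 * n_levels ** 2)

finite_v = panter_dite_noise(16, 0.3, 7.0)
limit = large_v_noise(16, 0.3)
ratio = finite_v / limit  # expected to be very close to 1 when V >> sigma_e
```

For a Laplacian density the ratio has the closed form $[1 - \exp(-\sqrt{2}V/3\sigma_e)]^3$, so the correction is negligible whenever V is several times $\sigma_e$.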

From (15) we see that if the rms value of the input video signal is $\sigma$,
then the rms S/N ratio in the video (considering only quantizing noise)
of a decoded television signal transmitted through a well-designed n-digit
differential PCM system is (in db)

$$S/N = 10 \log \frac{2N^2\sigma^2}{9\sigma_e^2},$$

$$S/N \approx -6.5 + 6n + 10 \log \frac{\sigma^2}{\sigma_e^2}. \tag{16}$$


Equation (16) gives the ratio of the rms video signal to rms quantizing
noise in the video. A bound on the S/N ratio to be presented later differs
from (16) only by a constant and suggests that this S/N ratio is
within about 5 db of that possible for any encoding system. To convert
this S/N to the more useful measure, peak-to-peak composite signal
to rms noise in the video, we must add a constant to (16) giving (in db)

$$S/N \approx -6.5 + C + 6n + 10 \log \frac{\sigma^2}{\sigma_e^2},$$


where C is the ratio in db of peak-to-peak composite signal to the rms
value of the video. The value of C is determined by the peak value of
the sync pulses. It is also dependent on picture material and upon such
apparently extraneous factors as the man or electronic device which
regulates the peak values of the video signal. For FCC standard monochrome
entertainment television some measurements, as well as some
data derived from the flying spot scanner used for these studies, indicate
that the rms value of the video is about one tenth the peak value of
the composite signal and the value of C is, therefore, about 20 db.*
Actual measurements of the signals derived from scenes A, B, and C
of Fig. 6 give 20.0, 19.8, and 18.1 db, respectively, for the value of C
(assuming FCC sync standards). An approximation, then, for the ratio
of peak-to-peak composite signal to rms quantizing noise in the video
for a typical FCC standard television signal is (in db)

$$S/N \approx 13.5 + 6n + 10 \log \frac{\sigma^2}{\sigma_e^2}. \tag{17}$$


Bennett showed that if the input signal is distributed evenly between
the quantizing levels, the rms value of the quantizing noise for standard
PCM is $E_0/\sqrt{12}$. $E_0$ is the step size of the uniform quantizer and
$E_0 = V_{peak}/2^n$, where $V_{peak}$ is the peak value of the signal to be encoded
and n is the number of quantizing digits. Therefore, the peak-to-peak
composite signal to rms noise ratio for standard PCM is (in db)

$$S/N = 20 \log \sqrt{12} + 20 \log 2^n = 10.8 + 6n. \tag{18}$$


If the sync pulses could be reconstructed by the decoder, then all the
PCM levels could be used for the video and the constant in (18) would
be $20 \log (\sqrt{12}/0.72) = 13.6$. In other words, if the sync pulses need not


* Some unpublished studies by J. W. Smith indicate that for systems which 
have automatic regulation of the peak signal excursions the constant C may be 
several db less than this. Such systems, in attempting to determine peak white 
and peak black, introduce a certain amount of clipping. 
be transmitted then the ratio of peak-to-peak composite signal to rms
noise in the video for standard PCM becomes (in db)

$$S/N \approx 13.6 + 6n. \tag{19}$$


Transmitting the sync lowers the S/N by 2.8 db for PCM. As one might
expect, provided we neglect the sync pulses, the S/N ratios for standard
PCM and differential PCM can be approximated by the same expression,
namely (17). Since the constant in (17) is somewhat arbitrary, it
would be easy to justify making it 13.6 to agree with (19). In standard
PCM, there is no feedback loop and the estimate of the sample value
$S_0$ based on previous sample values is simply 0, the mean value of the
input sequence. In this trivial case, since $\sigma_e = \sigma$, the DPCM system
becomes identical to standard PCM and (17) reduces approximately
to (19). Therefore, we may consider standard PCM to be a special
case of DPCM which is optimum when all the covariances $R_{ij}$, for $i \neq j$,
are zero.
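
The PCM constants quoted above follow from a one-line calculation, sketched here in Python. The function name is hypothetical, and the fraction 0.72 (the share of the peak-to-peak composite range taken up by the video) is the value that reproduces the 13.6 and 2.8-db figures in the text.

```python
import math

def pcm_constant_db(video_fraction=1.0):
    """Constant term of the PCM S/N ratio: 20 log10(sqrt(12) / video_fraction)."""
    return 20.0 * math.log10(math.sqrt(12.0) / video_fraction)

with_sync = pcm_constant_db()           # all levels span the composite signal
without_sync = pcm_constant_db(0.72)    # all levels span the video only
sync_penalty = without_sync - with_sync
```

Rounded to one decimal place, `with_sync` and `without_sync` are 10.8 and 13.6 db; their difference is a little over 2.8 db.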

When the feedback loop exists and when the amplifier gain(s) are
reasonably large, then the DPCM system can adequately encode the
sync pulses as well as the video. However, when the amplifier gain(s)
are too small, or when the feedback loop is not provided at all, as in
standard PCM, then either the decoder must be arranged to reconstruct
the sync pulses, or the range of the quantizer must be increased beyond
what is required for the video in order to accommodate the sync pulses.

From (5) and (7), we can express $\sigma_e$ in terms of the covariances $R_{ij}$.
For the simplest case, the previous-sample feedback system, the ratio of
peak-to-peak composite signal to rms quantizing noise in the video
can be expressed as (in db)

$$S/N = -6.5 + C + 6n + 10 \log \frac{\sigma^2}{\sigma^2 - R_{01}^2/\sigma^2}. \tag{20}$$

This equation illustrates that when $R_{01}/\sigma^2$ is close to 1, doubling the
bandwidth and the sampling rate (this doubles the horizontal resolution),
which is roughly equivalent to halving the value of $\sigma^2 - R_{01}^2/\sigma^2$,
increases the S/N ratio by about 3 db.


6.2 Design of the Quantizer 


One way to obtain the proper quantizer levels for minimizing the
rms quantizing noise is to form a function $y(z)$ such that when z takes
on uniformly spaced levels between $-V$ and $V$, y assumes the proper
quantizing levels. Smith shows that, when the probability density
of the signal to be quantized is that of (14), the function $y(z)$ is given by

$$y(z) = -\frac{V}{m}\ln\left[1 - \frac{z}{V}\bigl(1 - \exp(-m)\bigr)\right], \qquad 0 < z, \tag*{(21)*}$$

$$y(-z) = -y(z),$$

where

$$m = \frac{\sqrt{2}\,V}{3\sigma_e}.$$


There are more elegant and exact ways for finding the quantizing
levels, but it is doubtful if they can be incorporated into practical
systems. Furthermore, it is unlikely that these more sophisticated
techniques offer a significant decrease in the quantizing noise over what
can be obtained by the simple quantizers described here.

Smith studied quantizers with the characteristic of (21) in some detail 
for the application to standard PCM systems for speech. His rejection 
of this characteristic in favor of another results primarily from the wide 
variation of talker volumes present in speech channels. This objection 
does not apply to television channels whose signal levels are relatively 
constant. 

A typical 8 level quantizer designed by using the characteristic of 
(21) is shown in Fig. 11. The case shown is for V = 7 and m = 5.5. 
The output signal always assumes the quantizing level nearest to the 
input signal. Overload noise, which occurs when the signal to be quan- 
tized is outside the range of the quantizer (2.61 in Fig. 11), is a part 
of the quantizing noise which is minimized here. It must be considered 
separately only when the range of the quantizer is so small that over- 
load causes a significant alteration in the probability density function 
of (14). 
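
The levels of such a quantizer can be computed directly from the logarithmic characteristic $y(z) = -(V/m)\ln[1 - (z/V)(1 - e^{-m})]$, extended to negative z by odd symmetry. The sketch below is illustrative: the function names are hypothetical, and placing the uniformly spaced z levels at the midpoints of N equal cells on $(-V, V)$ is an assumption, chosen because it reproduces the outermost level of 2.61 quoted for the 8-level case with $V = 7$ and $m = 5.5$.

```python
import math

def smith_characteristic(z, v, m):
    """Logarithmic characteristic y(z), odd-symmetric about z = 0."""
    y = -(v / m) * math.log(1.0 - (abs(z) / v) * (1.0 - math.exp(-m)))
    return math.copysign(y, z)

def quantizer_levels(n_levels, v, m):
    """Output levels obtained from uniformly spaced z levels (cell midpoints)."""
    step = 2.0 * v / n_levels
    return [smith_characteristic(-v + (k + 0.5) * step, v, m)
            for k in range(n_levels)]

levels = quantizer_levels(8, 7.0, 5.5)
```

The resulting levels are closely spaced near zero, where the Laplacian density is largest, and spread out toward the ends of the range.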


VII. COMPUTER SIMULATION OF DPCM SYSTEMS 


The results of computer simulations verify that systems designed by 
the procedures presented here do function as predicted. 

By applying the principles outlined herein, the parameters for some
DPCM systems were determined and these systems were simulated on
the IBM 7094 digital computer. The input signals used were the 100-line,
100-samples-per-line television pictures obtained from the scenes
of Fig. 6. Section 5.3 contains a description of how these signals were
obtained. The results of the simulation are shown in Figs. 12 and 13.
The S/N ratios in these figures are ratios of rms video signal to rms

* This is the inversion of (A-6) of Ref. 21. 

Fig. 11 — Typical 8-level exponential quantizer characteristic obtained from
(21), plotted as output versus input. (Case shown is for m = 5.5 and V = 7.)


noise in the video. To get the ratio of peak-to-peak composite signal
to rms noise in the video from these curves, we must add C, the ratio
in db of peak-to-peak composite signal to rms video. For scenes A, B,
and C this value of C is 20.0, 19.8, and 18.1, respectively (assuming
FCC sync standards). Also plotted in these figures are curves of S/N
ratio which were predicted for these systems using (16). Fig. 12 gives
the results for previous-sample feedback systems (predictor number 1
in Table I), and Fig. 13 presents results for line-and-sample feedback
systems (predictor number 3 in Table I). Both figures illustrate the
performance of systems whose predictors are tailored to the incoming
signal. Table I illustrates that the use of more complicated predictive
systems, using samples $S_3$, $S_4$, and $S_5$ in addition to $S_1$ and $S_2$, does
not significantly lower the rms error in the prediction. Furthermore,
the optimum designs of predictors 6, 7, and 8, which give only a slight
decrease in the rms error, are radically different for scenes A, B, and C.
A system using predictor 6, 7, or 8 designed to give good performance
for scene B is likely to give poor performance for scenes A and C. This
is not the case with predictors 1 and 3 which were simulated. To verify
this, previous-sample feedback DPCM systems were simulated for 4, 5,

Fig. 12 — Ratio of rms video to rms quantizing noise in the video for systems
matched to each scene using previous-sample feedback DPCM. (Theoretical
results from (16) [straight lines] are compared with results of computer
simulation.)


and 6-bit encoding with $a_1 = 0.815$, m = 6 and V = 7. The signals
from all three scenes were used as inputs and the results were almost
identical to those shown in Fig. 12. Similarly, line-and-sample feedback
DPCM systems were simulated for 4, 5, and 6-bit encoding with $a_1 = 0.315$,
$a_2 = 0.650$, m = 8 and V = 7, and the three signals all produced
S/N ratios essentially the same as those in Fig. 13. The parameters in
these two DPCM systems are not critical and need not be exactly
matched to the picture material from which the incoming signals were
obtained.


VIII. THE MARGINAL UTILITY OF LINE FEEDBACK 


For the 100 by 100 matrix pictures used in this study, the use of line
feedback increased the S/N ratio by only about 2 or 3 db. This increase
is small because the sample values $S_1$ and $S_2$ contain substantially the
same information about $S_0$, the sample value to be predicted. And,
once $S_1$ has been used in the prediction, there is only a 2 or 3 db advantage
in simultaneously using $S_2$ in the prediction.

To illustrate this point, consider a scene whose contours of equal 

Fig. 13 — Ratio of rms video to rms quantizing noise in the video for systems
matched to each scene using line-and-sample feedback DPCM. (Theoretical
results from (16) [straight lines] are compared with results of computer simulation.)


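autocovariance are circles. Further assume that the autocovariance
between any two points, $S_i$ and $S_j$, separated by a distance D can be
expressed as $R_{ij} = \sigma^2 e^{-\alpha D}$. Both of these assumptions are reasonable
ones for television picture material. If $S_1$ (the previous sample) and $S_2$
(the adjacent sample on the previous line) are equidistant from $S_0$,
then $R_{01} = R_{02}$ and $R_{12} = \sigma^2 (R_{01}/\sigma^2)^{\sqrt{2}}$. From (5) the values of the
coefficients $a_1$ and $a_2$ are

$$a_1 = a_2 = \frac{R_{01}}{\sigma^2 + \sigma^2 (R_{01}/\sigma^2)^{\sqrt{2}}}, \tag{22}$$

and the mean square value of the prediction error is

$$\sigma_e^2 = \sigma^2 - \frac{2R_{01}^2}{\sigma^2 + \sigma^2 (R_{01}/\sigma^2)^{\sqrt{2}}}. \tag{23}$$

Compare this with the mean square value of the prediction error when
only $S_1$ is used in the prediction,

$$\sigma_e^2 = \sigma^2 - R_{01}^2/\sigma^2. \tag{24}$$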

Fig. 14 — Comparison of line-and-sample feedback system and previous-sample
feedback system when $R_{01} = R_{02}$. (To get actual peak composite signal
to rms noise ratios in an n-digit television system, add 13.5 + 6n db.)


In Fig. 14, the advantage to be gained in a DPCM system by providing
a line-and-sample feedback predictor is compared with that of a
simple previous-sample predictor. This figure applies to sequentially
scanned television systems in which the covariance between adjacent
samples on the same line is equal to the covariance between neighboring
samples on adjacent lines, i.e., $R_{01} = R_{02}$. In television systems using
interlace, the S/N ratio improvement provided by using line feedback
in addition to sample feedback will be even less. The two curves in
this figure are simply plots of $10 \log \sigma^2/\sigma_e^2$ where the value of $\sigma_e^2$ is given
by (23) for the line-and-sample feedback system, and by (24) for the
previous-sample feedback system. From (17) we see that the term
$10 \log \sigma^2/\sigma_e^2$ represents the S/N improvement to be expected from the
feedback loop. The two curves in Fig. 14 then show the value of the
feedback loops for the two predictors of interest. The distance between
the two curves is the improvement provided by the line-and-sample
feedback system over the previous-sample feedback system. It can be
shown that the maximum value of this improvement approaches about
1.9 db and this occurs as $R_{01}/\sigma^2$ approaches 1. In other words, in television
signals whose samples have the same covariance in the horizontal direction
as in the vertical direction, a line-and-sample feedback system can
provide, at best, only 1.9-db improvement in S/N ratio over a simple
previous-sample feedback system.
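
The 1.9-db limit can be checked numerically. The sketch below assumes circular contours of equal autocovariance with exponential decay, so that with $\rho = R_{01}/\sigma^2 = R_{02}/\sigma^2$ the covariance between $S_1$ and $S_2$ is $\rho^{\sqrt{2}}\sigma^2$. Solving the two normal equations then gives a normalized mean square error of $1 - 2\rho^2/(1 + \rho^{\sqrt{2}})$ for line-and-sample prediction, against $1 - \rho^2$ for previous-sample prediction. The function name is hypothetical.

```python
import math

def improvement_db(rho):
    """S/N gain (db) of line-and-sample over previous-sample feedback."""
    r12 = rho ** math.sqrt(2)                      # normalized covariance of S1, S2
    err_line = 1.0 - 2.0 * rho ** 2 / (1.0 + r12)  # line-and-sample prediction
    err_prev = 1.0 - rho ** 2                      # previous-sample prediction
    return 10.0 * math.log10(err_prev / err_line)

near_limit = improvement_db(0.9999)
moderate = improvement_db(0.9)
```

Expanding both errors to first order in $1 - \rho$ gives the limiting ratio $4/(4 - \sqrt{2})$, or about 1.9 db, which `near_limit` approaches; `moderate` is smaller.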

For the scenes used in this simulation, the line feedback loop provided
S/N ratio improvements between 2 and 3 db. This was more than the
1.9-db maximum because the scenes used had higher vertical than
horizontal correlation. When line-and-sample feedback DPCM is used,
sequentially scanning a scene can provide as much as, but no more than,
3-db improvement in S/N ratio over 2:1 interlaced scanning with the
same number of lines. This is true because the value of $1 - (R_{02}/\sigma^2)$
for the sequential scanning is about half of what it would be for interlaced
scanning. Exactly how much improvement is afforded by sequential
scanning depends on the values of $R_{01}$, $R_{02}$, and $R_{12}$ which
are determined by the scene scanned as well as by the television
standards used.

The lower curve in Fig. 14 can also be used to predict the advantage
to be gained by frame feedback DPCM. In this case, the abscissa would
be $R_{0F}/\sigma^2$ where $R_{0F}$ is the covariance between equivalent points on
adjacent frames. The S/N ratio for a frame feedback system is given by
(20) if $R_{01}$ is replaced by $R_{0F}$. Some measurements by Kretzmer and
Deriugin suggest that $R_{0F}/\sigma^2$ may, in general, be less than $R_{01}/\sigma^2$. This
implies that frame feedback may be of little value in improving the S/N
ratio in DPCM systems.


IX. DPCM FOR MONOCHROME ENTERTAINMENT TELEVISION


In 4.5-Mc/s entertainment black and white television there is little
advantage in basing the prediction on any sample values except the
previous one, unless, of course, sample values from previous fields are
available. For this previous-sample feedback system the approximate
value of the S/N ratio to be expected is given in (20). In a 525-line picture
at a frame rate of 30 per second, sampling at twice the bandwidth
or 9 Mc/s means that there are about 571 samples per line. Only 83
per cent of these, or 474, occur in the video while the others occur during
the horizontal and vertical sync pulses.
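
The sample counts quoted above follow from simple arithmetic, sketched here for convenience. The function name is hypothetical; the figures are those given in the text.

```python
def samples_per_line(sampling_rate_hz, lines_per_frame, frames_per_second):
    """Number of samples in one scanning line."""
    return sampling_rate_hz / (lines_per_frame * frames_per_second)

per_line = samples_per_line(9e6, 525, 30)  # 9 Mc/s, 525 lines, 30 frames/s
video_samples = 0.83 * per_line            # share falling in the video portion
```

Rounded to whole samples, `per_line` is 571 and `video_samples` is 474.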

Using simple linear interpolation* we see that, if scene A of Fig. 6 were
sampled at 9 Mc/s, the covariance between adjacent points would be
$R_{01} \approx 0.958$. For scene C, $R_{01} \approx 0.986$. Using these numbers and the
appropriate values of the constant C in (20), and remembering that
$\sigma^2 = 1$ for these signals, the S/N ratios to be expected for transmitting
these scenes over a DPCM channel can be found. The S/N ratio for
scene B would fall somewhere between those of scenes A and C. These
S/N ratios are compared with standard PCM and delta modulation
in Fig. 15. For the lower bit rates, these curves must be used with
discretion. The line representing the PCM performance is simply a plot
of (18) which assumes that the sync is transmitted. The PCM S/N

* Since the aspect ratio is 4:3 we could not transmit these pictures as they are.
We assume here that either the top or bottom 1/4 of the pictures is not transmitted.

Fig. 15 — Comparison of previous-sample feedback DPCM with standard
PCM and delta modulation for monochrome 4.5-Mc/s entertainment television.
(Peak-to-peak composite signal to rms quantizing noise in the video is the
signal-to-noise ratio shown. The sampling rate is 9 Mc/s.)


ratios can be increased by 2.8 db if the sync is not transmitted. The
delta modulation S/N ratios were found by an entirely different technique,
and it is gratifying that they are reasonably consistent with the
results found here for 1-digit DPCM, which, of course, is identical to
delta modulation with a sampling rate of twice the bandwidth.

Fig. 15 shows that, for a fixed bit rate, DPCM would give a 14-db
improvement over standard PCM for scene A and a 16.8-db improvement
for scene C. The advantage of DPCM can also be expressed in
terms of bit rate. For a given S/N ratio, DPCM gives a reduction in
bit rate over standard PCM of about 18 megabits (2 bits/sample).
Since the sampling rate for DPCM and PCM is assumed to be twice the
bandwidth or 9 Mc/s, these curves in Fig. 15 are actually defined only
at multiples of 9 megabits.


X. THE CHARACTER OF THE QUANTIZING NOISE 


Fig. 16 illustrates the autocovariance for the quantizing noise in a 
previous-sample feedback system. The cases shown are for scene B 
but these curves are typical of all three scenes. The exact autocovariance 

Fig. 16 — Autocovariance of quantizing noise for N = 2, 8, and 64 level
previous-sample feedback DPCM. (Case shown is for scene B with the DPCM system
optimized for this scene.)


function of the quantizing noise depends on the picture material of the
incoming signal. These autocovariance functions were essentially the
same when the parameters of the DPCM system did not exactly match
the statistics of the incoming signal. The spectra of the quantizing noise
in the three scenes are found by taking the Fourier transforms of their
autocovariance functions. These spectra were found to be relatively flat
for both previous-sample feedback and line-and-sample feedback systems.
In both cases, there were erratic peaks and valleys at multiples
of the line rate but the peaks and valleys differed from each other
only by about 2 to 4 db, these differences being slightly greater for the
previous-sample feedback system than for the line-and-sample feedback
systems. In neither case did the spectra show any general tendency to
increase or decrease for higher frequencies. Fig. 16 illustrates that the
correlation between sample values is quite weak. The usual assumption
of flat quantizing noise in DPCM systems is probably a good one for
most purposes.

Fig. 17 shows a plot of the amplitude density function of the quantizing
noise for scene B. It is typical of all the scenes that the amplitude density
is relatively flat for a small number of quantizing levels N and becomes
more Gaussian shaped as N gets large. In all cases, however, even when
N was 64, the amplitude density function was flatter than Gaussian.
For all three scenes with N = 2 the quantizing noise amplitude density
function had a dip near zero.


XI. THE PENALTY 


Removing the redundancy from the transmitted signal has the dis- 
advantage that the signal becomes more vulnerable to noise introduced 
in the medium of transmission. This is true of predictive systems, in 
general, whether or not they are digital. A technique for reducing the 
redundancy in analog television signals by linear filtering has been 
proposed by Franks.¹⁴ The similarity between this analog technique
and DPCM is apparent. The utility of digital transmission itself is 
simply that it provides a desirable trade of bandwidth for noise im- 
munity in the transmission medium. We may think of DPCM as a 
counter-trade. For a given amount of quantizing noise, DPCM allows 
transmission at a lower bit rate (and therefore bandwidth) than standard 
PCM. Errors in the transmission channel, however, degrade the decoded 
DPCM signal more than they would in standard PCM. 

Fig. 17 — Quantizing noise amplitude density functions for N = 2, 8, and 64
level previous-sample feedback DPCM. (Case shown is for scene B with the
DPCM system optimized for this scene.)

The decoder in a DPCM system is a linear device which operates on
an incoming sequence with rms value $\sigma_e$ to produce a decoded output
with rms value $\sigma$. Just as the decoder increases the level of the incoming
signal by $20 \log \sigma/\sigma_e = k$ db, it will also increase the level of accompanying
noise (caused by digit errors in the transmission channel) by k db.

This is easily illustrated by considering the decoder of the previous-sample
feedback system with the predictor shown in Fig. 4. Noise $\eta$,
caused by a digit error, on a member of the incoming sequence is fed
through the feedback loop and occurs on all subsequent samples. Such a
noise on a single member of the incoming sequence causes the error
sequence $\eta, a_1\eta, a_1^2\eta, a_1^3\eta, \ldots$ in the decoded output sequence. The
noise energy in the decoded signal is, therefore,

$$\eta^2 + (a_1\eta)^2 + (a_1^2\eta)^2 + \cdots = \eta^2\,\frac{1}{1 - a_1^2}.$$

For the properly designed previous-sample feedback system $a_1 = R_{01}/\sigma^2$
and $1/(1 - R_{01}^2/\sigma^4) = \sigma^2/\sigma_e^2$. Therefore, a noise of energy $\eta^2$ in the
transmission channel appears in the decoded signal as noise with energy
$\eta^2(\sigma^2/\sigma_e^2)$, a gain of $10 \log \sigma^2/\sigma_e^2$ db.
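
The geometric-series argument can be confirmed by feeding a single error through a previous-sample decoder and summing the energy of the response. This is a minimal sketch with a hypothetical function name; the gain 0.934 is the scene C value of $a_1$ from Table I, used here only as an example.

```python
def decoded_error_energy(a1, eta=1.0, n_samples=2000):
    """Energy of the decoder output caused by one channel error of size eta."""
    out, energy = 0.0, 0.0
    for k in range(n_samples):
        received = eta if k == 0 else 0.0  # a single error on the first sample
        out = a1 * out + received          # decoder: a1 * previous output + input
        energy += out * out
    return energy

a1 = 0.934
simulated = decoded_error_energy(a1)
predicted = 1.0 / (1.0 - a1 ** 2)  # closed form of the geometric series
```

For this gain the single error is amplified in energy by a factor of about 7.8, or roughly 8.9 db.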

We have already shown that DPCM provides a decrease in quantizing
noise of about $10 \log \sigma^2/\sigma_e^2$ db over standard PCM (assuming the sync
pulses are not transmitted). The penalty paid for this decrease in quantizing
noise is that the noise in the decoded signal introduced in the transmission
medium is increased by exactly that same amount. This does
not mean that DPCM provides no advantage. For, in digital systems,
noise introduced in the transmission medium can be made extremely
small and the limiting degradation in DPCM systems is generally
quantizing noise. Decreasing the quantizing noise by k db may be
desirable even if the noise introduced in the channel is increased by this
amount.

When the probability P of a digit error in the transmission medium is
small enough so that the probability of getting two errors in the same
word may be neglected, then the noise power $N_t$ in the decoded output
introduced by the transmission medium is directly proportional to P
and we can express $N_t$ (in db) as

$$N_t = K_1 + 10 \log P + 10 \log \sigma^2/\sigma_e^2. \tag{25}$$

From (17) the quantizing noise $N_q$ can be expressed (in db) as

$$N_q = K_2 - 10 \log \sigma^2/\sigma_e^2. \tag{26}$$

The constants $K_1$ and $K_2$ are both dependent on the number of quantizing
levels as well as other parameters. The term $10 \log \sigma^2/\sigma_e^2$ represents
the effect of DPCM in both equations. Reducing the quantizing noise
$N_q$ by k db through DPCM requires increasing $10 \log \sigma^2/\sigma_e^2$ by this
amount and this increases the noise $N_t$ introduced in the medium of
transmission by k db. Whether or not DPCM can be used to advantage
depends on the relative importance of $N_t$ and $N_q$ in limiting the performance
of the system. From (25) we see that if we require $N_t$ to remain
constant while reducing the quantizing noise by k db, we must reduce
the term 10 log P by k db. This requires reducing the value of P by a
factor of $10^{0.1k} = (1.26)^k$.
Observed 50 to 60 Gc/s Attenuation
for the Circular Electric Wave in
Dielectric-Coated Cylindrical
Waveguide Bends


By GLENN E. CONKLIN 
(Manuscript received March 8, 1966) 


A thin low-loss dielectric coating inside a 0.875-inch i.d. round waveguide
was achieved by drawing a slightly oversize flexible tube of polyethylene into 
the waveguide. The plastic tube fits snugly to the guide wall. Several 14-foot 
lengths of waveguide were lined and loss measurements made. The lined 
pipes were 8 per cent lossier than the unlined. The increase in loss could be 
almost completely accounted for by the theory of H. Unger. 

A 92° Fresnel integral bend (curvature a linear function of arc length) 
was constructed from a 14-foot section of the 0.875-inch i.d. lined waveguide.
The insertion loss between 50 and 60 Gc/s had a value of about 0.1 db. 
This small value is close to the theoretical result for such bends. 


I. INTRODUCTION 


In a nonideal section of circular cylindrical waveguide, the circular
electric wave TE₀₁ couples to other modes of transmission which are
permitted.¹ The strongest coupling for bends is to the TM₁₁ mode since
it is degenerate with the TE₀₁ mode in a perfectly conducting straight
waveguide. The degeneracy of the phase factors of the TE₀₁ and TM₁₁
modes in the uncoated cylindrical waveguide gives rise to the serious
problem of an increase in loss upon bending. For example, in a 2-inch
i.d. bare pipe at a 5.4-mm wavelength, the attenuation constant is
doubled when the radius of curvature is a few miles.¹ The power
transfer between the two modes can be reduced by removing the phase
degeneracy. One technique for doing this is to introduce a thin layer
of dielectric next to the wall of the waveguide. The electric field
intensity of the TE₀₁ mode vanishes at the wall but that of the TM₁₁
mode has a large value there; hence, the effect on the propagation
constant of the TM₁₁ mode is larger than on that of the TE₀₁ mode.

The solution of the characteristic equation for the normal modes
of the circular cylindrical waveguide with dielectric coating obtained
by Unger and Morgan¹ indicates there is an increase in attenuation of
the TE₀₁ mode attributable to dielectric loss and an added copper loss
arising from the tendency of the electric field intensity to concentrate in
the dielectric next to the wall. Attenuation measurements on such a
straight length of cylindrical waveguide are reported in Section II and
compared to theoretical predictions insofar as is possible.

A bend for which the curvature is a linear function of bend length 
follows the Fresnel integral curve. This is the elastic bend which is 
obtained by applying forces at the center and at the two ends of a rod. 
According to Unger, a bend in a dielectric-coated waveguide with such 
a curvature has a minimum of mode conversion, hence, minimum loss.® 
Measurements on a 92° bend with the linearly tapered curvature are 
discussed in Section III.


II. STRAIGHT DIELECTRIC-COATED WAVEGUIDE 


Long straight sections of dielectric-lined waveguide were prepared by
using the technique of pulling a slightly oversize polyethylene liner 
into an oxygen-free copper waveguide. Only a small length of the liner 
was in contact with the copper wall during the pulling. The remainder 
was stretched enough elastically to have an outside diameter less than 
the inside diameter of the pipe. The liner expanded in place against the 
waveguide wall after the tension was removed. 

The polyethylene sleeve was made by extrusion molding from a crease- 
resistant formulation of polyethylene and rubber.* The dielectric 
constant and loss tangent at 55.2 Gc/s of this material were measured® 
and are 2.25 and 1 × 10⁻⁴. This means that the material is very good
from the dielectric attenuation point of view. A finished extrusion tended 
to have small air pockets and trapped dust particles. Another difficulty 
existed with the extrusion. The inner die happened to be slightly ec- 
centric with the outer during manufacture. A maximum wall thickness 
$t_{\max}$ of 0.0134 inch was measured on one side and a minimum $t_{\min}$ of
0.0095 inch on the other. Furthermore, the tubing had hash marks 
remaining from the water cooling used during extrusion. 

The TE₀₁ attenuation coefficient of the dielectric-coated waveguide
was measured by observing the 3-db bandwidth of a cavity containing 
the test section. Any cavity technique for measuring the attenuation 
coefficient implies the use of a considerable amount of auxiliary
equipment. The cavity quality test set which was used operated between
50 and 60 Gc/s and used an M-1977 backward-wave oscillator.’ The cavity
quality was measured by sweeping the frequency of the oscillator through
the cavity resonance and then observing the frequency difference between 
the half-power points of the transmission curve. The apparatus was made 
sensitive to dispersion rather than absorption so that the measurements 
would be relatively insensitive to drifts in amplifier gain. The cavity 
assembly consisted of a two-hole coupler constructed from helix wave- 
guide, the long section of waveguide to be studied, a 12-inch section of 
helix waveguide, and a piston as a shorting termination. The piston 
had a port for removal of air from inside the cavity. A short section of 
helix in the cavity was necessary in order to remove unwanted modes. 
Mode interference is manifest by observing markedly increased band- 
width values for some positions of resonance for the piston; thus, the 
minimum measured TE₀₁ bandwidth was selected as the most probably
correct value. 

The long section of waveguide under test has a quality factor $Q_2$ given
by the expression®

$$\frac{1}{Q_2} = \frac{2\alpha_2\beta_2}{\beta_{c2}^2 + \beta_2^2}. \tag{1}$$

The parameters $\alpha_2$ and $\beta_2$ are the real and imaginary portions of the
propagation constant of the TE₀₁ wave in the waveguide section under
consideration. The factor $\beta_{c2}$ is the free-space propagation constant
corresponding to the TE₀₁ cutoff frequency. The reciprocal cavity
quality factor in terms of the measured 3-db frequency bandwidth
$\Delta f_m$ is $1/Q_m = \Delta f_m/f$ and is the sum of two portions. The first is
that associated with the intrinsic attenuation of the waveguide,
$\Delta f_2/f = 1/Q_2$. The second is associated with the attenuation at the
ends of the cavity, $b/L$, where b is a constant characteristic of the
ends and L the length of the cavity. The resulting expression for the
measured cavity quality factor is

$$\frac{1}{Q_m} = \frac{\Delta f_m}{f} = \frac{\Delta f_2}{f} + \frac{b}{L}. \tag{2}$$

The expression for the attenuation factor of the test section becomes

$$\alpha_2 = \frac{\beta_{c2}^2 + \beta_2^2}{2\beta_2}\left(\frac{\Delta f_m}{f} - \frac{b}{L}\right). \tag{3}$$
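
The reduction from a measured bandwidth to an attenuation coefficient can be sketched numerically as follows. This is only an illustration under stated assumptions: the cutoff constant is taken as 3.8317 divided by the guide radius (the first zero of the derivative of the Bessel function J₀), the relation between the cutoff, guide, and free-space propagation constants is the usual one for an air-filled guide, and the attenuation and end-loss values are hypothetical numbers chosen only to exercise the formulas.

```python
import math


def attenuation_from_bandwidth(f_hz, delta_f_m_hz, b_over_l, beta_c, beta):
    """Attenuation coefficient (nepers per unit length) of the test section.

    Subtract the end-loss term b/L from the measured reciprocal Q,
    then convert the intrinsic 1/Q of the section to an attenuation.
    """
    inv_q_section = delta_f_m_hz / f_hz - b_over_l
    return (beta_c ** 2 + beta ** 2) / (2.0 * beta) * inv_q_section


# Assumed geometry: 0.875-inch i.d. guide, 55 Gc/s, air-filled.
C_LIGHT = 2.998e8                      # m/s
f = 55.0e9                             # Hz
radius = 0.875 * 0.0254 / 2.0          # m
beta_c = 3.8317 / radius               # TE01 cutoff constant, rad/m
k0 = 2.0 * math.pi * f / C_LIGHT       # free-space constant, rad/m
beta = math.sqrt(k0 ** 2 - beta_c ** 2)

# Hypothetical values used only for a round-trip check.
alpha_true = 1.0e-3                    # Np/m, assumed
end_loss = 2.0e-7                      # b/L, assumed
inv_q2 = 2.0 * alpha_true * beta / (beta_c ** 2 + beta ** 2)
delta_f_m = f * (inv_q2 + end_loss)    # bandwidth such a cavity would show

alpha = attenuation_from_bandwidth(f, delta_f_m, end_loss, beta_c, beta)
print(f"recovered attenuation: {alpha:.6e} Np/m")
```

The round trip simply confirms that (1), (2), and (3) are mutually consistent; it says nothing about the actual measured values.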


Unlined evacuated cylindrical copper waveguides were measured first 
in order to establish the end loss corrections and the loss characteristics 
of the bare copper pipe to be used later for lining and in making bends. 
These parameters were found by measuring the cavity bandwidth as 
a function of length. The intrinsic bandwidth of the test section was
given directly from a plot of $\Delta f_m$ versus $1/L$ by the intercept and the end
loss factor calculated from the slope. The attenuation coefficients in
Fig. 3 of the bare copper pipes agree well with the results of King.’
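
The intercept-and-slope procedure described above amounts to a straight-line fit of measured bandwidth against reciprocal cavity length. A minimal sketch follows; the lengths, intrinsic bandwidth, and end constant are hypothetical values invented for the illustration, and the ordinary least-squares fit is an assumed stand-in for whatever graphical method was actually used.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: bandwidth = intrinsic + (b * f) / L.
f = 55.0e9                              # Hz
lengths_ft = [4.0, 7.0, 10.0, 14.0]     # cavity lengths, assumed
true_intrinsic = 90.0e3                 # Hz, assumed
true_b = 2.0e-6                         # ft, assumed end constant

inv_length = [1.0 / length for length in lengths_ft]
bandwidth = [true_intrinsic + true_b * f * x for x in inv_length]

slope, intercept = fit_line(inv_length, bandwidth)
b_est = slope / f                       # end-loss constant from the slope
print(f"intrinsic bandwidth = {intercept:.1f} Hz, b = {b_est:.3e} ft")
```

With noisy measurements the same fit gives the intrinsic bandwidth as the value extrapolated to infinite cavity length.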

The losses due to air were determined using a long section of unlined 
copper waveguide open to the atmosphere. These measurements were 
needed in order to correct the measurements taken on the dielectric- 
coated waveguide. The dielectric-coated waveguide could not be evacu- 
ated since an occasional small pocket of air between the liner and the 
copper wall caused the liner to collapse upon evacuation. The plot of 
the measured atmospheric losses versus frequency in Fig. 1 is in satis- 
factory agreement with the results of others.’ 

An initial check of the placement of the dielectric coating inside the 
guide was made by fabricating many sections of various lengths and then 
measuring the bandwidth of the cavity formed from each. The bandwidth
measurements were made at a fixed frequency near 55 Gc/s (Fig. 2).
Of interest are the points associated with the bare pipe. These are
not scattered appreciably, which means that all of the bare pipes are 
quite uniform. The circled points associated with the liner show con- 
siderable scattering. Apparently the liner was not in close enough con- 
tact with the copper wall. Any blister in the liner causes the lined pipe 
to be lossy. The liner contact with the wall was improved by alternately 
cooling and warming for several cycles. A new set of measurements was
made, and the decrease in attenuation indicated by the triangular points
was observed. Next, a 14-foot section of the shrunken dielectric-coated
guide was measured across the frequency band from 50 to 60 Gc/s. The
increase in the attenuation coefficient of the waveguide due to the liner
was calculated by subtracting the losses due to air and the measured
attenuation of the unlined waveguide from the measured attenuation
coefficient at each frequency. The results for this section of coated
guide are given in Fig. 3.

Fig. 1 — Attenuation of air in 0.875-inch i.d. waveguide.

Fig. 2 — Cavity bandwidth as a function of reciprocal length.

The increase in the TE₀₁ attenuation coefficient of the dielectric-lined
waveguide due to the dielectric loss in the liner is given by Unger as


skopu t° 
Aap = 1.835 X 10°22" " .” db/mile, (4) 
Boa? 
and the increase in copper loss due to the liner being in contact with the 
waveguide wall as 


Aas = ane’ — 1)k07t?. (5) 


In these equations, $\epsilon^* = \epsilon' - j\epsilon''$ is the dielectric constant and t the
thickness of the coating, a the radius of the waveguide, $k_0$ the phase
propagation constant in unbounded vacuo, $p_{01}'$ is the first root of the
Bessel equation $J_0'(p_{01}'a) = 0$, and $\beta_{01}$ is the phase propagation
coefficient of the TE₀₁ wave in the waveguide. Equations (4) and (5) are
perturbation expressions for which $\beta_{01}$ of the lined waveguide is assumed
to be the same as that of the bare pipe.

Fig. 3 — Attenuation of coated and uncoated waveguide.

An analysis of the increase in attenuation coefficient of the
dielectric-coated waveguide due to the coating is dependent upon knowing
the magnitude of the coating thickness. However, the inside and outside
radii associated with the sleeve were slightly eccentric. A first approach
to the problem of eccentricity can be made by assuming that the
functions for calculating the attenuation coefficient in cylindrical
geometry are changed only negligibly by the introduction of the
eccentricity. In such case the average values for the quantities $t$, $t^2$,
and $t^3$ can be used,


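$$\bar{t} = \tfrac{1}{2}\,(t_{\max} + t_{\min}), \tag{6}$$

$$\overline{t^2} = \bar{t}^{\,2} + \frac{c^2}{2}, \tag{7}$$

$$\overline{t^3} = \bar{t}^{\,3} + \tfrac{3}{2}\,\bar{t}\,c^2, \tag{8}$$

where c is the distance of separation of centers,

$$c = \tfrac{1}{2}\,(t_{\max} - t_{\min}). \tag{9}$$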

Substitution of the measured values of $t_{\max}$ and $t_{\min}$ into (6), (7), (8),
and (9) indicates that the arithmetic average is suitable for use in the
calculations. The measured average wall thickness for the tubing used
in this work was 0.0115 inch and the separation of centers was 0.0020
inch.
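
The substitution can be checked with a few lines of arithmetic. The sketch below assumes that the wall thickness of the eccentric liner varies around the circumference as the mean thickness plus c times the cosine of the azimuth angle; that model is an assumption introduced here to evaluate the averages of $t^2$ and $t^3$, and only the two measured thicknesses come from the text.

```python
import math

# Measured extreme wall thicknesses (inches), from the text.
t_max, t_min = 0.0134, 0.0095

t_bar = 0.5 * (t_max + t_min)          # arithmetic average thickness
c = 0.5 * (t_max - t_min)              # separation of centers

# Closed-form circumferential averages under the assumed cosine model.
mean_t2 = t_bar ** 2 + c ** 2 / 2.0
mean_t3 = t_bar ** 3 + 1.5 * t_bar * c ** 2

# Numerical check: average t**2 and t**3 over equally spaced azimuths.
n = 360
samples = [t_bar + c * math.cos(2.0 * math.pi * i / n) for i in range(n)]
num_t2 = sum(t * t for t in samples) / n
num_t3 = sum(t ** 3 for t in samples) / n

ratio_t2 = mean_t2 / t_bar ** 2
ratio_t3 = mean_t3 / t_bar ** 3
print(f"t_bar = {t_bar:.5f} in, c = {c:.5f} in")
print(f"mean(t^2)/t_bar^2 = {ratio_t2:.4f}, mean(t^3)/t_bar^3 = {ratio_t3:.4f}")
```

Under this model the eccentricity raises the average of $t^3$ by roughly 4 percent and the average of $t^2$ by roughly 1.5 percent relative to the values computed from the arithmetic mean thickness alone.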

The values given by (4) and (5) were calculated for several frequencies
and their sum is compared in Fig. 4 to the values measured for a long
section of the dielectric-coated waveguide. There is apparently some
conversion of TE₀₁ to other modes of propagation.

Fig. 4 — Increase in attenuation of 0.875-inch i.d. copper waveguide due
to liner.


III. DIELECTRIC-COATED WAVEGUIDE BEND 


A typical bend was constructed to fit into an existing waveguide
transmission system. This required that the angle of bend $\theta_b$ be 92° and
the length $l_b$ be 14.185 feet. In order to design a linearly tapered bend
with a given bend angle and length, consider a bend with the curvature
being an arbitrary function of the length of arc z taken along the axis
of the pipe, K = f(z). The curvature of the bend described in the
rectangular coordinate system ($\Gamma$, $\Sigma$) of Fig. 5 is given by the expression


$$K = \frac{d^2\Gamma/d\Sigma^2}{\left[1 + (d\Gamma/d\Sigma)^2\right]^{3/2}}. \tag{10}$$

Of supplementary use is the equation for the differential length of arc,

$$dz = \left[1 + (d\Gamma/d\Sigma)^2\right]^{1/2} d\Sigma. \tag{11}$$

Fig. 5 — Coordinate system for half of the bend.

Equation (10) is solvable in terms of the length of bend under the
conditions that at the origin z = 0, K = 0, and
$d\Gamma/d\Sigma\,|_{z=0} = \tan(\theta_b/2)$. The solution is then:

$$\Sigma = \int_0^z \cos\left(\frac{\theta_b}{2} + \int_0^z K\,dz\right) dz \tag{12}$$

$$\Gamma = \int_0^z \sin\left(\frac{\theta_b}{2} + \int_0^z K\,dz\right) dz. \tag{13}$$

The chord length $l_c$ of a symmetrical bend of length $l_b$ is

$$l_c = 2\int_0^{l_b/2} \cos\left(\frac{\theta_b}{2} + \int_0^z K\,dz\right) dz. \tag{14}$$

When the curvature is made a linear function of length of arc,
$K = -k'z$ with $k'$ being a constant, the result is a Fresnel integral curve.
The curvature parameter $k'$ can be evaluated in terms of design
parameters from the definition of the differential bend angle and the
radius of curvature R,

$$d\theta = \frac{dz}{R} = -k'z\,dz. \tag{15}$$

The half bend angle occurs over the half-length of the pipe; thus,

$$k' = 4\theta_b/l_b^2. \tag{16}$$

Equations (12), (13), and (16) can be used to calculate the coordinate
points for a Fresnel integral bend of arbitrary length and bend angle.
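
A minimal numerical sketch of such a calculation is given below for the 92°, 14.185-foot bend. It assumes the linear curvature law K = -k'z measured from one end of the bend, so that the ray direction relative to the chord falls from half the bend angle at the end to zero at the center, and it evaluates the two integrals with a simple midpoint rule. The function name, the step count, and the choice of integration rule are illustrative assumptions.

```python
import math


def fresnel_half_bend(bend_angle_deg, bend_length, steps=2000):
    """Centerline of half a linearly tapered bend.

    Returns the curvature parameter k' and a list of (sigma, gamma)
    points from the end of the bend (z = 0) to its center (z = l/2).
    """
    theta_b = math.radians(bend_angle_deg)
    k_prime = 4.0 * theta_b / bend_length ** 2
    dz = (bend_length / 2.0) / steps
    sigma = gamma = 0.0
    points = [(0.0, 0.0)]
    for i in range(steps):
        z_mid = (i + 0.5) * dz
        # Direction angle: theta_b/2 plus the integral of K = -k' z.
        phi = theta_b / 2.0 - 0.5 * k_prime * z_mid ** 2
        sigma += math.cos(phi) * dz
        gamma += math.sin(phi) * dz
        points.append((sigma, gamma))
    return k_prime, points


k_prime, points = fresnel_half_bend(92.0, 14.185)
chord = 2.0 * points[-1][0]            # chord length of the full bend
sagitta = points[-1][1]                # offset of the center from the chord
print(f"k' = {k_prime:.5f} rad/ft^2, chord = {chord:.3f} ft, "
      f"center offset = {sagitta:.3f} ft")
```

The list of points could be scaled directly onto a forming jig; a finer step count only matters if the coordinates are needed to better than the midpoint rule provides.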

A centerline curve was calculated and a bend constructed to fit the 
desired design parameters. A convenient technique for making a bend 
with linearly tapered curvature is to form the waveguide around a 
wooden jig. The waveguide is then clamped in place. The attenuation of 
the bend was measured using the resonant cavity technique described 
previously and the results are plotted in Fig. 6. The attenuation was 
found to decrease slightly with increasing frequency. Theoretical values 
for the attenuation of the bend can be calculated from the equations of 
Unger and Morgan? and are shown in Fig. 6. 


IV. CONCLUSIONS 


The straight waveguide with liner had a slightly greater TE₀₁
transmission loss than the bare pipe. Part of the loss, as suggested by Unger
and Morgan, arises from dielectric heat loss and an increase in copper 
loss. The remaining portion of the added loss could probably be de- 
creased because the liner was not as smooth and did not fit as snugly 
to the wall as desired. Extrusion of the dielectric directly inside the 
waveguide should yield improved performance. 

The Fresnel integral bend as constructed had an insertion loss of
about 0.1 db between 50 and 60 Gc/s, which is close to the theoretical
value. Lining the round waveguide removed the TE₀₁ and TM₁₁ mode
degeneracy and most of the large loss associated with bending. This
enabled the construction of a new and useful device.

Fig. 6 — Total attenuation of the lined bend compared to straight section of
equal length.


The author wishes to express his appreciation to D. H. Ring for his
critical review of this article and to J. W. Bell and W. K. Whitacre for
aid in construction and test of the bend.


Estimation of the Mean of a Stationary
Random Process by Periodic Sampling


By H. T. BALCH, J. C. DALE, T. W. EDDY and 
R. M. LAUVER 


(Manuscript received February 23, 1966) 


Estimating the mean of a stationary random process from the average 
of equally weighted samples taken periodically in a closed interval (0, T) 
is considered. The variance of this estimator as a function of the number 
of samples in the interval is given in the form of a modified sampling theo- 
rem. 


I. INTRODUCTION 


This paper* considers the problem, commonly encountered in detec- 
tion theory, of estimating the mean of a stationary random process from 
samples taken periodically in a closed interval (0, T). The samples are,
in general, correlated and the estimator used is the average of equally 
weighted samples taken in the interval. Existing results are extended to 
give a clearer interpretation of the dependence of the variance on the 
number of samples. The dependence is obtained in terms of the power 
spectral density of the process in the form of a modified sampling 
theorem. 


II. THEORY


2.1 General

To estimate the mean value, A, of s(t) where

$$s(t) = A + n(t), \tag{1}$$

the first sample is taken at t = 0, and a total of N + 1 samples is taken
in time T. n(t) is a sample function from a wide-sense stationary random
process with mean zero and known autocorrelation function $R(\tau)$.


* This work was supported by the U. S. Navy, Bureau of Ships under contract 
No. Nobsr-89401. 

The estimator of A is

$$\hat{A} = \frac{1}{N+1}\sum_{m=0}^{N} s(mT/N). \tag{2}$$

For a fixed T, N is to be chosen to minimize the variance of $\hat{A}$.
The variance of $\hat{A}$ is given by

$$\sigma^2(\hat{A}) = \frac{1}{N+1}\sum_{m=-N}^{N}\left(1 - \frac{|m|}{N+1}\right) R(mT/N). \tag{3}$$


Equation (3) may be found in slightly different form in the literature.”” 
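It is now convenient to define a weighting function, $q_x(\tau)$, by

$$q_x(\tau) = \begin{cases} 1 - |\tau|/x, & |\tau| \le x \\ 0, & \text{otherwise.} \end{cases} \tag{4}$$

With this definition, (3) may be written as

$$\sigma^2(\hat{A}) = \frac{1}{N+1}\int_{-\infty}^{\infty} q_{(N+1)T/N}(\tau)\,R(\tau)\sum_{m=-\infty}^{\infty}\delta\left(\tau - \frac{mT}{N}\right) d\tau, \tag{5}$$

where $\delta(\tau)$ is the Dirac delta function. It is more revealing to express
the variance in terms of spectral densities; thus, we make the following
definitions:

$$F(\omega) = \frac{1}{N+1}\int_{-\infty}^{\infty} q_{(N+1)T/N}(\tau)\,R(\tau)\exp(-j\omega\tau)\,d\tau = Q(\omega) \circledast S(\omega), \tag{6}$$

where

$$Q(\omega) = \frac{1}{N+1}\int_{-\infty}^{\infty} q_{(N+1)T/N}(\tau)\exp(-j\omega\tau)\,d\tau = \frac{T}{N}\left[\frac{\sin[\omega(N+1)T/2N]}{\omega(N+1)T/2N}\right]^2 \tag{7}$$

and $S(\omega)$ is the spectral density of n(t).
Also define

$$G(\omega) = \frac{1}{N+1}\int_{-\infty}^{\infty} q_{(N+1)T/N}(\tau)\,R(\tau)\exp(-j\omega\tau)\sum_{m=-\infty}^{\infty}\delta\left(\tau - \frac{mT}{N}\right) d\tau. \tag{8}$$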

By using Poisson’s sum formula we can show that

$$G(\omega) = \frac{N}{T}\sum_{m=-\infty}^{\infty} F(\omega - m2\pi N/T). \tag{9}$$

Now, comparing (5), (8), and (9), we observe that

$$\sigma^2(\hat{A}) = G(0) = \frac{N}{T}\sum_{m=-\infty}^{\infty} F(m2\pi N/T). \tag{10}$$

Because $Q(\omega)$ is approximately zero for

$$|\omega| \ge (2\pi/T)[N/(N+1)],$$

if $S(\omega)$ is zero for $|\omega| \ge 2\pi B$, their convolution, $F(\omega)$, will be
approximately zero for $|\omega| \ge 2\pi[B + N/T(N+1)]$. From this result and
(10) we observe that choosing

$$\frac{2\pi N}{T} \ge 2\pi\left[B + \frac{N}{T(N+1)}\right] \tag{11}$$

makes

$$\sigma^2(\hat{A}) = G(0) \approx (N/T)F(0). \tag{12}$$


Although the restriction of (11) appears to minimize the variance of
$\hat{A}$, it should be observed that F(0) is also a function of N, namely

$$F(0) = \frac{T}{2\pi N}\int_{-\infty}^{\infty} S(\omega)\left[\frac{\sin[\omega(N+1)T/2N]}{\omega(N+1)T/2N}\right]^2 d\omega. \tag{13}$$

If $2\pi/T$ is of the same order of magnitude as $2\pi B$, the bandwidth of
$S(\omega)$, then (N/T)F(0) and therefore $\sigma^2(\hat{A})$ may be minimized by
choosing the smallest value of N satisfying (11). In such cases, making
N larger may actually increase the variance, as illustrated in the
examples. (N/T)F(0) would be independent of N if the first sample is
taken at t = T/N rather than t = 0. Solving (11) for N yields an
approximate rule that

$$N \approx \frac{BT}{2}\left[1 + \left(1 + \frac{4}{BT}\right)^{1/2}\right], \tag{14}$$

where N is an integer, will minimize the variance of $\hat{A}$. It should be
recalled that the total number of samples taken in T is N + 1.
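
The rule can be turned into a small routine that returns the smallest integer N satisfying (11). Multiplying (11) through by T(N + 1)/2π shows it is equivalent to N² ≥ BT(N + 1), which is the test used below. The function name and the sample values of BT are illustrative assumptions.

```python
import math


def min_intervals(bt: float) -> int:
    """Smallest integer N with N**2 >= BT * (N + 1), i.e. satisfying (11)."""
    if bt <= 0:
        raise ValueError("BT must be positive")
    # Start from the closed-form root in (14), then correct for rounding.
    n = max(1, math.ceil(0.5 * bt * (1.0 + math.sqrt(1.0 + 4.0 / bt))))
    while n * n < bt * (n + 1):
        n += 1
    while n > 1 and (n - 1) ** 2 >= bt * n:
        n -= 1
    return n


for bt in (1.0, 10.0, 100.0):
    n = min_intervals(bt)
    print(f"BT = {bt:6.1f}: N = {n}, total samples = {n + 1}")
```

For large BT the result exceeds BT by about one, in line with the observation that the required sampling rate approaches B.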

The form of (9) is frequently encountered in sampling theory, where
one sometimes thinks in terms of the original spectra, $F(\omega)$, shifted
by integral multiples of the sampling frequency, $2\pi N/T$. The Nyquist
frequency is determined such that overlapping of the sideband spectra
is small. This is too restrictive when one is interested in the variance,
since then only the value of $G(\omega)$ at $\omega = 0$ is of interest. Thus, sampling
can be done at a rate sufficient to prevent overlapping of sidebands at
$\omega = 0$. Equation (14) may be considered as a modified sampling theorem,
stating that, to minimize the variance, the sampling frequency, $f_s$,
must satisfy


$$f_s \approx \frac{B}{2}\left[1 + \left(1 + \frac{4}{BT}\right)^{1/2}\right]. \tag{15}$$


For large T, $f_s$ is equal to one half of the Nyquist frequency required to
reconstruct the time function.


2.2 Variance for Large T


When T is sufficiently large, the $Q(\omega)$ function approaches a delta
function, namely

$$Q(\omega) \approx 2\pi\delta(\omega)/(N+1), \tag{16}$$

and

$$\sigma^2(\hat{A}) \approx \frac{N}{T(N+1)}\sum_{m=-\infty}^{\infty} S(m2\pi N/T). \tag{17}$$

If, as before, $S(\omega)$ is bandlimited and

$$2\pi N/T \ge 2\pi\left[B + \frac{N}{T(N+1)}\right] \approx 2\pi B, \tag{18}$$

then

$$\sigma^2(\hat{A}) \approx S(0)/T. \tag{19}$$

Notice that taking more than BT samples will not decrease the variance
appreciably. Taking less than BT samples will increase the variance at
a rate which depends on $S(\omega)$. The dependence of the variance on N
can be easily obtained for this limiting situation from (17). In general,
if $dS(\omega)/d\omega \le 0$ for $\omega \ge 0$, then the variance of $\hat{A}$ will also be a
monotonically decreasing function of N for $N/(N+1) \approx 1$. On the other
hand, if the spectral density of the noise is not monotonically decreasing
for $\omega \ge 0$, then the variance of $\hat{A}$ will have local minima for values of
N < BT. These statements concerning monotonicity would be true for
all T and N if (N/T)F(0) were independent of N.
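

2.3 Limit of Continuous Sampling

The limit of continuous sampling has been derived elsewhere’ and is
easily obtained from (3). The result is

$$\sigma^2(\hat{A})_c = \frac{1}{T}\int_{-T}^{T}\left(1 - \frac{|\tau|}{T}\right) R(\tau)\,d\tau = \frac{1}{T}\int_{-\infty}^{\infty} q_T(\tau)\,R(\tau)\,d\tau \tag{20}$$

$$= \frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\left(\frac{\sin \omega T/2}{\omega T/2}\right)^2 d\omega. \tag{21}$$

As T becomes large,

$$\sigma^2(\hat{A})_c \approx \frac{1}{T}\int_{-\infty}^{\infty} S(\omega)\,\delta(\omega)\,d\omega = \frac{S(0)}{T}. \tag{22}$$

Thus, for large T, taking N = BT samples gives the same variance as
sampling continuously.


2.4 Equivalent Independent Samples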


When T is large, one can determine the number of independent
samples required to achieve the same variance as continuous sampling.
The variance for $N_i$ independent samples is

$$\sigma^2(\hat{A})_{N_i} = \frac{1}{N_i}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\,d\omega. \tag{23}$$

Equating this variance to the variance of (22) requires

$$N_i = \frac{T}{S(0)}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\,d\omega. \tag{24}$$

Defining the effective bandwidth as

$$2B_e = \frac{1}{S(0)}\,\frac{1}{2\pi}\int_{-\infty}^{\infty} S(\omega)\,d\omega, \tag{25}$$

(24) can be written as

$$N_i = 2B_eT. \tag{26}$$

However, the variance achieved by continuous sampling may be obtained
by taking N = BT samples in time T. Thus,

$$N_i = (2B_e/B)\,N \tag{27}$$

equates N for minimum variance to the number of independent samples
required to achieve the same variance.

738 THE BELL SYSTEM TECHNICAL JOURNAL, MAY-JUNE 1966


III. EXAMPLES


The variance of the sample mean as a function of number of samples 
(N + 1) and length of record (T) has been computed for several spectral
densities. 

The variance of the sample mean shown on the following figures was 
computed using (3). 
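
A computation of this kind can be sketched in a few lines. The example below evaluates (3) directly for an exponential autocorrelation function exp(-4|τ|), which is the transform pair of the Markoff spectrum 8/(ω² + 16) treated later, and compares the result with the continuous-sampling integral of (20) evaluated in closed form for that autocorrelation. The function names and the chosen values of T and N are assumptions made for the illustration; they are not the values plotted in the figures.

```python
import math


def variance_of_sample_mean(autocorr, record_length, n_intervals):
    """Variance of the mean of N + 1 equally weighted samples spaced T/N
    apart, with the first sample at t = 0, for autocorrelation R(tau)."""
    if n_intervals < 1:
        raise ValueError("need at least one sampling interval")
    total = 0.0
    for m in range(-n_intervals, n_intervals + 1):
        weight = 1.0 - abs(m) / (n_intervals + 1)
        total += weight * autocorr(m * record_length / n_intervals)
    return total / (n_intervals + 1)


def markoff_autocorr(tau):
    """R(tau) = exp(-4|tau|), unit variance."""
    return math.exp(-4.0 * abs(tau))


def continuous_limit_markoff(record_length, a=4.0):
    """(1/T) * integral of (1 - |tau|/T) * exp(-a|tau|) over (-T, T)."""
    t = record_length
    return (2.0 / t) * (1.0 / a - (1.0 - math.exp(-a * t)) / (a * a * t))


T = 1.0
var_two_samples = variance_of_sample_mean(markoff_autocorr, T, 1)
var_dense = variance_of_sample_mean(markoff_autocorr, T, 2000)
var_continuous = continuous_limit_markoff(T)

for n in (1, 2, 4, 8, 16, 2000):
    v = variance_of_sample_mean(markoff_autocorr, T, n)
    print(f"N = {n:5d}: variance = {v:.5f}")
print(f"continuous sampling: {var_continuous:.5f}")
```

Replacing `markoff_autocorr` with the autocorrelation of another spectrum gives the corresponding curve for that process.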


3.1 Rectangular Spectrum 


$$S_1(\omega) = \begin{cases} \tfrac{1}{2}, & -2\pi < \omega < 2\pi \\ 0, & \text{elsewhere.} \end{cases} \tag{28}$$

Fig. 1 shows $\sigma^2(\hat{A})$ plotted against number of samples. Each curve
of the set represents a different length of record T. The table on the
figure shows the relationship of the curves to the length of record.

The most striking feature of the curves on Fig. 1 is the abrupt steps
in $\sigma^2(\hat{A})$ as N is increased. This behavior is predicted by (17). Equation
(14) predicts the approximate value of N for minimum variance. These
values of N are shown for each of the curves by a small circle.

Fig. 1 — Variance of the sample mean as a function of the number of samples
and length of record for a process with rectangular spectral density.

An interesting point to note here is that in some regions a better
estimate of the mean is obtained when the same number of samples is
taken for a smaller T. Also for small T, $\sigma^2(\hat{A})$ reaches a minimum and
then increases as more samples are taken. This implies that for small
values of T a smaller variance is obtained by taking a smaller number of
samples (but including the end points) than would be obtained by
continuous sampling.


3.2 Sawtooth Spectrum 








$$S_2(\omega) = \begin{cases} |\omega|/2\pi, & -2\pi \le \omega \le 2\pi \\ 0, & \text{elsewhere.} \end{cases} \tag{29}$$

This is an interesting case for two reasons. First, its spectrum is
not monotonically decreasing. This gives rise to local minima and
maxima in $\sigma^2(\hat{A})$ as a function of N caused by the spectrum shape.
Second, its spectrum at $\omega = 0$ is 0, thus enhancing the error due to
approximating $Q(\omega)$. The results are shown in Fig. 2.

Fig. 2 — Variance of sample mean as a function of number of samples and
length of record for a process with a sawtooth spectral density.


3.3 Markoff Spectrum 


$$S_3(\omega) = \frac{8}{\omega^2 + 16}. \tag{30}$$

This is an example of a nonbandlimited spectrum. The values of
$\sigma^2(\hat{A})$ are shown in Fig. 3. A point worth noting here is that if the
bandwidth of the process were defined as the width at the one-half power
points and the time function sampled according to (14), the value of
$\sigma^2(\hat{A})$ obtained would be larger by about a factor of 2 than the minimum
value obtained by letting N approach infinity.

This example is also the same one treated by Fine and Johnson} for
small values of T. Curve A on Fig. 3 agrees with their results.


IV. SUMMARY 


Theory has been presented which predicts the behavior of the variance
of the sample mean of periodic samples taken from a stationary random
process. The variance is given in terms of the power spectrum of the
sampled process. Three interesting results have been shown:

(i) When BT > 1, the variance of the sample mean is essentially
minimized when BT samples are taken.

(ii) The variance of the sample mean is not necessarily monotonically
decreasing as a function of the number of samples taken in a
fixed record length.

(iii) For short record lengths, it is possible to obtain a smaller
variance with a small, finite number of samples than with continuous
sampling.

Fig. 3 — Variance of the sample mean as a function of the number of samples
and length of record for a process with Markoff spectral density.


Physical Limitations on Ray Oscillation
Suppressors


By D. MARCUSE 


(Manuscript received January 12, 1966) 


The question of whether it is possible to suppress ray oscillations in light
waveguides is important for the design of light communications systems.
With the help of Liouville’s theorem of statistical mechanics it is shown that
it is impossible to reduce simultaneously the amplitudes and the angles of 
ray oscillations if the ray originates in and returns to a region of low index 
of refraction. A reduction of both ray amplitudes and angles can be achieved 
only if the ray moves from a region of low to one of high index of refraction. 

Liouville’s theorem is used to derive a condition relating the output posi- 
tion and slope of a ray which traverses an optical transformer to its input 
position and slope. With p; , x; denoting the canonically conjugate variables 
of the output ray and p;, x; those of the input ray, the condition derived 
from Liouville’s theorem states that the Jacobian of the transformation is one. 

O(pi , Xi) _ 


Ad cE SoZ | 


O( pi, Ui) 


I. INTRODUCTION 


Light transmission systems can be built in various ways. A continuous 
dielectric medium of rotational symmetry with an index of refraction 
which depends on the distance r from the optical axis 


n = n(r) 


is capable of guiding light rays if n(r) decreases monotonically with in- 
creasing r. Another example is the beam-waveguide consisting of a 
series of lenses which refocuses the light beam periodically counteracting 
diffraction. 

Both of these examples have one point in common — a ray which is 
launched off-axis into the waveguide follows an oscillatory trajectory. 
However, even if a light ray travels on-axis it will be forced into an 
oscillatory trajectory by any imperfection of the guidance medium.¹ To
keep the ray amplitudes small requires a very high precision of
alignment which might be hard to obtain for long waveguides.

It seemed natural, therefore, to consider means of suppressing these 
ray oscillations, and if all such efforts fail, to ask for a general physical 
principle which says that such ray oscillation suppressors are impossible. 

The search for such a general principle is even more important as it is 
easy to construct models of beam waveguides which violate physical 
principles in subtle ways, thus seeming to lead to ray oscillation
suppressors. One such system is shown in Fig. 1. Assume that we deform thin
lenses as indicated in the figure and assume further that these lenses
behave just like plane thin lenses in that they deflect each ray by an
amount $\beta_n$, which depends only on the radius $r_n$ of the ray, but not on
the input angle,

$$\tan \beta_n = -r_n/f.$$

Making the paraxial approximation, which means replacing $\tan \beta_n$ by
$\beta_n$ and $\tan \alpha_n$ by $\alpha_n$, we obtain the ray equations

$$r_{n+1} = r_n + \alpha_n (z_{n+1} - z_n), \tag{1a}$$

$$\alpha_{n+1} = \alpha_n - r_{n+1}/f. \tag{1b}$$


If the lenses are warped to form parabolas, st 
enti — fo = A+ Or? — Tayt?). | (2) 


Equations (1a) and (1b) together with (2) describe rays which, if they 
travel from the left to the right in Fig. 1, exhibit decreasing amplitudes. 
In fact, if one allows each ray to travel a sufficient distance they approach 
the axis arbitrarily closely. 

It appears that we have invented a ray oscillation suppressor. 

The object of this paper is to prove that such a device is impossible. 
So the question arises: What went wrong with the argument presented 
above? A closer examination shows that the assumption that $\beta_n$ is
independent of $\alpha_n$ violates Liouville’s theorem. We will return to this
question later.

Fig. 1 — Beam-waveguide composed of warped, thin lenses.

The general proof of the impossibility of constructing a ray oscilla- 
tion suppressor was suggested by J. R. Pierce. 


II. PROOF OF THE IMPOSSIBILITY OF A RAY OSCILLATION SUPPRESSOR 


The proof is based on Liouville’s theorem.’ It refers to the
representation of physical systems in phase space. Phase space is the space
of the canonically conjugate variables $q_i$ and $p_i$ describing the system.
Each system is represented by one point in phase space. Many identical
systems which happen to be in different states described by different
values of their coordinates $q_i$ and $p_i$ can be described by the density of
their representation points in phase space. Liouville’s theorem states
that the density of any given configuration of points in phase space is
constant if the systems under consideration obey the canonical
differential equations

$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i}, \qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{3}$$

H is the Hamiltonian function describing the system. Another version
of Liouville’s theorem states that the volume containing a constant
number of points in phase space remains constant in time.
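
The volume-preserving property can be illustrated numerically for a simple two-dimensional ray map. The sketch below is an assumed toy model, not the system analyzed in the paper: a ray with transverse position x and conjugate momentum p drifts through a uniform medium (slope p/√(n² - p²)), receives a thin-lens kick that depends only on x, and drifts again. The Jacobian determinant of the overall map, estimated by central differences, should equal one. All function names, distances, and the focal length are hypothetical.

```python
import math


def drift(x, p, distance, n=1.0):
    """Free propagation in a uniform medium of index n: p is unchanged,
    x advances by distance times the slope p / sqrt(n^2 - p^2)."""
    return x + distance * p / math.sqrt(n * n - p * p), p


def thin_lens(x, p, focal_length):
    """Ideal thin lens (assumed model): the momentum kick depends on x only."""
    return x, p - x / focal_length


def system(x, p):
    """Drift, lens, drift: a hypothetical one-lens section."""
    x, p = drift(x, p, distance=1.0)
    x, p = thin_lens(x, p, focal_length=0.5)
    x, p = drift(x, p, distance=1.0)
    return x, p


def jacobian_det(mapping, x, p, h=1e-6):
    """Determinant of d(x_out, p_out)/d(x_in, p_in) by central differences."""
    x_a, p_a = mapping(x + h, p)
    x_b, p_b = mapping(x - h, p)
    x_c, p_c = mapping(x, p + h)
    x_d, p_d = mapping(x, p - h)
    dx_dx = (x_a - x_b) / (2.0 * h)
    dp_dx = (p_a - p_b) / (2.0 * h)
    dx_dp = (x_c - x_d) / (2.0 * h)
    dp_dp = (p_c - p_d) / (2.0 * h)
    return dx_dx * dp_dp - dx_dp * dp_dx


det = jacobian_det(system, 0.01, 0.02)
print(f"Jacobian determinant = {det:.9f}")
```

A map whose determinant differed from one would compress or expand phase-space area, which is what a device reducing both ray amplitudes and ray angles in the same medium would have to do.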

For Liouville’s theorem to be applicable to light rays we have only to
show that light rays can be described by equations of the form (3). The
derivation of the Hamiltonian equations of geometric optics can be
found in Ref. 5. The derivations are sketched here for the sake of
convenience.

To show this we start with Fermat’s principle, which states that a
light ray connecting two arbitrary points $P_1$ and $P_2$ in a medium of index
of refraction

$$n = n(x,y,z) \tag{5}$$

follows a path such that

$$\frac{1}{c}\int_{P_1}^{P_2} n\,ds = \text{extremum}. \tag{6}$$

Here, c is the velocity of light in vacuum and s is the path length
measured along the ray trajectory. Introducing coordinates x, y, z, we can
rewrite (6) as

$$\frac{1}{c}\int_{P_1}^{P_2} n\sqrt{1 + x'^2 + y'^2}\,dz = \text{extremum} \tag{7}$$

with

$$x' = \frac{dx}{dz} \quad \text{and} \quad y' = \frac{dy}{dz}. \tag{8}$$

Equation (7) is analogous to Hamilton’s principle of least action with
the Lagrangian

$$L = n\sqrt{1 + x'^2 + y'^2} \tag{9}$$

and the time t being replaced by the z-coordinate.
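Once the Lagrangian of a system is known the momenta $p_x$ and $p_y$
canonically conjugate to the x and y coordinates are defined by

$$p_x = \frac{\partial L}{\partial x'} = \frac{n x'}{\sqrt{1 + x'^2 + y'^2}} \tag{10a}$$

$$p_y = \frac{\partial L}{\partial y'} = \frac{n y'}{\sqrt{1 + x'^2 + y'^2}} \tag{10b}$$

and the Hamiltonian function by

$$H = p_x x' + p_y y' - L = -\sqrt{n^2 - p_x^2 - p_y^2}. \tag{11}$$

The variational problem (7) is solved by the equations?

$$x' = \frac{\partial H}{\partial p_x}, \qquad y' = \frac{\partial H}{\partial p_y} \tag{12a}$$

$$p_x' = -\frac{\partial H}{\partial x}, \qquad p_y' = -\frac{\partial H}{\partial y}. \tag{12b}$$

Equations (12a) and (12b) are analogous to (3), which shows that the
ray description can be given in terms of canonical differential equations.
The equations of (12a) are satisfied identically while the equations of
(12b) lead to the well-known ray equations

$$\frac{1}{\sqrt{1 + x'^2 + y'^2}}\,\frac{d}{dz}\left(\frac{n x'}{\sqrt{1 + x'^2 + y'^2}}\right) = \frac{\partial n}{\partial x} \tag{13a}$$

$$\frac{1}{\sqrt{1 + x'^2 + y'^2}}\,\frac{d}{dz}\left(\frac{n y'}{\sqrt{1 + x'^2 + y'^2}}\right) = \frac{\partial n}{\partial y}. \tag{13b}$$

Introducing

$$ds = \sqrt{1 + x'^2 + y'^2}\,dz \tag{14}$$

(13a) and (13b) can be written in the more familiar form‘

$$\frac{d}{ds}\left(n\frac{dx}{ds}\right) = \frac{\partial n}{\partial x} \tag{15a}$$

$$\frac{d}{ds}\left(n\frac{dy}{ds}\right) = \frac{\partial n}{\partial y}. \tag{15b}$$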

The preceding discussion of ray dynamics was sketched only to prove 
that Liouville’s theorem applies to light rays. 

Now, we are finally in a position to prove the impossibility of a ray
oscillation suppressor. To simplify the discussion let us limit the problem
to two dimensions, $x$ and $z$. Assume that $z$ is the axis of the system.
The phase space is now two dimensional and is spanned by the coordinates
$x$ and $p_x$. Let us further assume that we consider an ensemble of rays
whose initial conditions are such that the representation points of all
these rays fill a square area centered around the origin of phase space
as shown in Fig. 2. Each ray represented in this area has a certain
distance $x$ from the optical axis $z$ and a certain slope given by (10)

$$x' = \frac{p_x}{\sqrt{n^2 - p_x^2}}. \qquad (16)$$

Fig. 2 — Volume in phase space occupied by light ray representation points.

If an oscillation suppressor were possible we would require that all the
rays initially contained in the square of phase space of Fig. 2 would
approach the z-axis more closely. In addition, we would require that
the angles between the rays and the z-axis don’t increase or perhaps
even decrease. If we look at the rays initially and finally in a region of
constant index of refraction $n$, for example in vacuum, $n = 1$, we would
find that the square of Fig. 2 has deformed either into the rectangle, if
the angles don’t shrink, or into the smaller square, if the angles as well
as the amplitudes shrink, as indicated in Fig. 2 by dotted lines. In either
case, we find that the area (volume in two dimensions) of phase space
occupied by the points representing the initial ray positions has
decreased. However, Liouville’s theorem says that this is impossible, so
we see that a ray oscillation suppressor is impossible. We can trade
off amplitude at the expense of spread in the $p_x$ direction. In this
case, either the tangent of the ray angles $x'$ or the index of refraction
has to increase. It is even possible to decrease both the ray amplitudes
and angles by increasing $n$ along the z-axis. However, the area in phase
space has to stay constant. The initial square has to deform into a
rectangle of equal area which stretches along the $p_x$ axis. After some
distance we have reached a region of high index of refraction and find
both the amplitude and the ray angles decreased (but not the $p_x$ values,
which have increased). For most applications to be able to make use of the
effect, we have to leave the high index medium. But as soon as $n$ drops to
a low value the angles have to increase to keep the spread in $p_x$
constant, and again we have traded a decrease in the ray amplitudes for an
increase in the ray angles. The ray positions in most optical systems will
eventually spread far apart if we allow the rays to travel far enough.
This means that the region occupied in phase space, though its volume
remains constant, assumes a “filamentous” appearance and extends to many
different parts of phase space.
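The area-conservation argument can be checked numerically. The following is a minimal sketch, not taken from the paper: it assumes free propagation over a hypothetical length in a uniform medium, uses the slope relation (16), and measures the area enclosed by the image of a sampled square in the $(x, p_x)$ plane with the shoelace formula. The function names and the numerical values are illustrative assumptions.

```python
import math

def propagate(x, p, length, n=1.0):
    """Free propagation over `length` in a uniform medium of index n (2-D case).

    The slope follows from (16): x' = p / sqrt(n^2 - p^2).  The momentum p
    is unchanged because n does not depend on x.
    """
    slope = p / math.sqrt(n * n - p * p)
    return x + length * slope, p

def polygon_area(points):
    """Shoelace formula for the area enclosed by a closed polygon."""
    area = 0.0
    for (x0, p0), (x1, p1) in zip(points, points[1:] + points[:1]):
        area += x0 * p1 - x1 * p0
    return abs(area) / 2.0

def square_boundary(half_width, samples_per_side=200):
    """Densely sampled boundary of a square centered on the origin."""
    h, m = half_width, samples_per_side
    grid = [-h + 2.0 * h * k / m for k in range(m)]  # -h up to just below +h
    bottom = [(t, -h) for t in grid]
    right = [(h, t) for t in grid]
    top = [(-t, h) for t in grid]
    left = [(-h, -t) for t in grid]
    return bottom + right + top + left

def x_extent(points):
    xs = [x for x, _ in points]
    return max(xs) - min(xs)

# Hypothetical ensemble: |x| <= 0.1 and |p_x| <= 0.1, propagated over length 5.
initial = square_boundary(0.1)
final = [propagate(x, p, length=5.0) for x, p in initial]

area_before = polygon_area(initial)
area_after = polygon_area(final)
print("area before:", area_before, "area after:", area_after)
print("x extent before:", x_extent(initial), "after:", x_extent(final))
```

In this sketch the square shears into a slanted figure: the ray positions spread out while the enclosed area stays the same, which is the behavior the text attributes to Liouville’s theorem.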


III. A BASIC RELATION FOR OPTICAL TRANSFORMERS


Liouville’s theorem allows one to formulate a theorem which all rays 
passing through an optical device (optical transformer) have to obey. 

Let us assume we have an arbitrary optical transformer with input
rays whose positions and slopes are described by the canonically
conjugate variables $q_i, p_i$ and corresponding output rays with
variables $\bar{q}_i, \bar{p}_i$. The output variables are related to the
input variables by

$$\bar{q}_i = \bar{q}_i(q_i, p_i), \qquad \bar{p}_i = \bar{p}_i(q_i, p_i).$$

The input rays may occupy a volume $dV = dq_1\,dq_2\,dp_1\,dp_2$ in phase
space. This volume deforms, as the rays propagate, to $d\bar{V}$.
Liouville’s theorem states that these volumes are identical:

$$dV = d\bar{V}. \qquad (17)$$

The volume $d\bar{V}$ on the right hand side of (17) can be rewritten as

$$d\bar{V} = \frac{\partial(\bar{q}_i, \bar{p}_i)}{\partial(q_i, p_i)}\,dq_1\,dq_2\,dp_1\,dp_2 \qquad (18a)$$

or

$$d\bar{V} = \frac{\partial(\bar{q}_i, \bar{p}_i)}{\partial(q_i, p_i)}\,dV. \qquad (18b)$$

We conclude from (17) and (18) that the Jacobian must be equal to
unity

$$\frac{\partial(\bar{q}_i, \bar{p}_i)}{\partial(q_i, p_i)} = 1. \qquad (19)$$


Equation (19) is stated in Ref. 6 without proof. 

The derivation of (19) is based on the fact that the ray trajectory can
be described by the differential equations of (13). However, there may
be discontinuities in the index of refraction, $n$, where the ray equations
cannot be applied. But it is well known that rays traverse
discontinuities of the index of refraction. The ray trajectory is
unaltered if the discontinuity is replaced by a rapidly changing but
continuous transition of $n$. In this way we assure that the ray equations
hold everywhere and that (19) is applicable even in that case.

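Limiting the problem to two dimensions we can write (19) as

$$\frac{\partial \bar{x}}{\partial x}\,\frac{\partial \bar{p}}{\partial p} - \frac{\partial \bar{x}}{\partial p}\,\frac{\partial \bar{p}}{\partial x} = 1. \qquad (20)$$

Equation (20) allows us to derive an interesting relation between the
input and output angles of rays passing through an infinitesimally thin
optical transformer (Fig. 3). If the thickness of the optical transformer
shrinks to zero we have $\bar{x} = x$ and consequently,

$$\frac{\partial \bar{x}}{\partial p} = 0, \qquad \frac{\partial \bar{x}}{\partial x} = 1$$

so that (20) reduces to

$$\frac{\partial \bar{p}}{\partial p} = 1$$

whose solution is

$$\bar{p} = p + f(x). \qquad (21)$$

Fig. 3 — Illustration of a thin optical transformer.

With the help of (10) we see that if

$$x' = \tan\alpha, \qquad \bar{x}' = \tan\bar{\alpha},$$

it follows that

$$p = n\sin\alpha, \qquad \bar{p} = \bar{n}\sin\bar{\alpha}$$

so that (21) can be written as

$$\bar{n}\sin\bar{\alpha} = n\sin\alpha + (\bar{n}\sin\bar{\alpha})_{\alpha=0}. \qquad (22)$$

This is a fundamental relation which all rays passing through thin lenses
or any other thin optical device have to obey.
If both $\alpha \ll 1$ and $\bar{\alpha} \ll 1$ and $\bar{n} = n = 1$, (22)
simplifies to

$$\beta = \bar{\alpha} - \alpha = (\bar{\alpha})_{\alpha=0}. \qquad (23)$$
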


This is the relation which is used to describe the change of ray angles 
passing through a thin lens. Equation (22) shows that this thin lens 
relation holds approximately for rays which impinge nearly perpendicu- 
lar to the lens. If the rays make large angles with respect to the direction 
normal to the lens surface (23) has to be replaced by (22). This explains 
the error which was made in deriving (1). If this equation is corrected 
by using (22) rather than (16) the ray oscillation suppressing quality 
of the warped thin lenses disappears. 
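The difference between the exact relation (22) and the thin lens approximation (23) can be illustrated with a small calculation. This is a sketch under stated assumptions: the deflection of the ray entering at zero angle, written `alpha_bar_0` here, is a hypothetical input value, and the function names are not from the paper.

```python
import math

def exact_output_angle(alpha, alpha_bar_0, n_in=1.0, n_out=1.0):
    """Output angle from (22): n_out sin(a_out) = n_in sin(a) + n_out sin(a_out at a = 0)."""
    p_out = n_in * math.sin(alpha) + n_out * math.sin(alpha_bar_0)
    return math.asin(p_out / n_out)

def paraxial_output_angle(alpha, alpha_bar_0):
    """Output angle from the thin lens relation (23): a_out = a + (a_out at a = 0)."""
    return alpha + alpha_bar_0

def discrepancy(alpha, alpha_bar_0):
    """Absolute difference between the two predictions, in radians."""
    return abs(exact_output_angle(alpha, alpha_bar_0)
               - paraxial_output_angle(alpha, alpha_bar_0))

# Hypothetical cases: a nearly perpendicular ray and a steeply inclined ray.
for alpha, alpha_bar_0 in [(0.02, -0.01), (0.5, -0.1)]:
    print(alpha, alpha_bar_0, discrepancy(alpha, alpha_bar_0))
```

For the nearly perpendicular ray the two relations agree closely, while for the steep ray they differ noticeably, in line with the remark that (23) has to be replaced by (22) at large angles.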

The general expressions (19), (20), or (22) can be used to check the 
physical realizability of optical models. 
A New Signal Format for
Efficient Data Transmission


By F. K. BECKER, E. R. KRETZMER, 
and J. R. SHEEHAN 


(Manuscript received February 18, 1966) 


I. BACKGROUND 


Data communication systems in current use generally require
substantially more bandwidth than the Nyquist minimum of one-half cycle
per symbol. This comes about for two main reasons: first, the baseband
signal spectrum has a gradual roll-off beyond the theoretical minimum;
second, the modulation process needed to translate the baseband
spectrum to the bandpass channel generates additional side frequencies
which must be preserved to permit recovery of the signal. For example,
a recently described vestigial-sideband system uses an extra 50 per
cent of bandwidth for each of these two reasons. Consequently, such a
system handles one symbol per cycle, and each symbol can convey as
many levels as the signal-to-noise ratio permits, independent of all
adjacent symbols. Thus, with $n$ levels, each symbol yields $\log_2 n$
binary digits.


II. NEW TECHNIQUE


A new approach recently implemented avoids the need for excess
bandwidth by using baseband shaping such that the received signal
spectrum is a half-period sinusoid. This shaping not only permits
two symbols per cycle of bandwidth, but it also forces the baseband
signal to be free of any dc component. This, in turn, permits
single-sideband techniques for translation to any desired frequency band
without increase in bandwidth.


III. UNDERLYING PRINCIPLE 


The new technique is a departure from the conventional methods
which are based on zero intersymbol interference. Instead, it permits
intersymbol interference, but in precisely prescribed amounts. This
is best illustrated by examining the impulse response for each case or,
more accurately, the end-to-end response to a single symbol (e.g., a
“one” in a background of all zeros). Fig. 1 shows the response of a
conventional system, alongside the corresponding frequency-domain
function. Fig. 2 shows the corresponding functions for the new system.
In both cases, the spacing between successive symbols is $T$, but only
in the second case is the bandwidth confined to $1/2T$.

Fig. 1 — Conventional system — pulse response and frequency domain function.

Fig. 2 — Partial response system — pulse response and frequency domain function.
(Panel label: $|H(f)| = \sin 2\pi Tf$.)

The fact that the symbol response, as shown in Fig. 2, extends over
several symbol intervals requires compensating decoding at the receiver
or, advantageously, precoding at the transmitter similar to “duobinary”
or “biternary” coding. Performance with either method is comparable
to that achieved with other three-level systems. The precoding used in
the present implementation converts the original binary data sequence
$a_1, a_2 \cdots a_n$ into a new binary sequence $b_1, b_2 \cdots b_n$,
which the channel converts into the received three-level sequence
$c_1, c_2 \cdots c_n$. The following relations hold

$$c_n = b_n - b_{n-2} \qquad (1)$$

(by definition of the system response)

$$a_n = [b_n + b_{n-2}] \bmod 2 \qquad (2)$$

(by design of the precoder).

Fig. 3 — Transmitter precoding system.

It follows that $a_n = [c_n] \bmod 2$, which means odd and even-numbered
levels of $c_n$ signify $a_n = 1$ and zero, respectively, the same as with
biternary/duobinary encoding. The precoding relation (2) is implemented
with a mod-2 adder and a shift register (see Fig. 3).
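Relations (1) and (2) can be exercised in a few lines. The sketch below is illustrative only: it solves (2) for $b_n$, models the channel by (1) without noise, and assumes that the two-stage shift register starts out holding zeros (an assumption not stated in the text). The function names are hypothetical.

```python
def precode(a):
    """Precoder from (2): b_n = (a_n + b_{n-2}) mod 2, shift register assumed to start at zero."""
    b = []
    for n, a_n in enumerate(a):
        b_prev2 = b[n - 2] if n >= 2 else 0
        b.append(a_n ^ b_prev2)  # mod-2 adder
    return b

def channel(b):
    """Noise-free partial-response channel from (1): c_n = b_n - b_{n-2}."""
    return [b_n - (b[n - 2] if n >= 2 else 0) for n, b_n in enumerate(b)]

def decode(c):
    """Symbol-by-symbol decision: a_n = c_n mod 2 (levels -1 and +1 both map to 1)."""
    return [c_n % 2 for c_n in c]

data = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
received = channel(precode(data))
print("three-level sequence:", received)
print("recovered data      :", decode(received))
```

Because each received level is decided on its own, a single wrong decision in this scheme does not propagate into later symbols, which is the usual motivation for precoding at the transmitter rather than compensating at the receiver.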


IV. MODULATION PROCESS 


Of all the known methods for translating a signal into a desired
frequency band only single-sideband transmission preserves the signal
bandwidth. This is illustrated in Fig. 4; the numbers correspond to an
experimental Data-Phone* set presently being tested on the switched
telephone network at 2400 bits/sec.

In this instance the transmitted spectrum covers exactly one octave.
It is generated by simply sampling the data and selecting the desired
spectral band with a filter approximating the square root of the
half-sinusoidal characteristic $H(f)$. A matching filter at the receiver
then completes the shaping shown in Fig. 4. The carrier pilot is
transmitted outside of these filters; it is recovered through a
narrow-band filter (just wide enough to preserve any multiplicative noise
imparted by the channel). The carrier phase is rotated by 90 degrees
before it is used for demodulation, corresponding to odd symmetry of
$h(t)$ (see Fig. 2).

Fig. 4 — Partial response spectrum for single-sideband transmission.
(Horizontal axis: frequency, f, in kilohertz, marked at 0, 1.2, and 2.4.)

* Data-Phone is a service mark of the Bell System.


V. EQUALIZATION 


Extensive computer simulation has shown that the system can
tolerate substantial amplitude and phase distortion of various shapes.
Since this tolerance is obtained at the expense of noise margin, it proved
desirable to incorporate a limited amount of automatic equalization
for operation on the switched telephone network. This has been
accomplished by formulating a new algorithm for automatic equalization
of partial-response signaling formats.


VI. SUMMARY 


A data terminal incorporating the above principles has successfully 
performed in preliminary tests over a variety of cross-country dialed 
telephone connections. The data rate was 2400 bits/sec. In addition, 
a 150 bit/sec channel was operated in the reverse direction in the band 
below 1000 c/s. Details concerning design and performance will be 
published after evaluation is complete. 
Performance of a Forward-Acting Error-Control System
on the Switched Telephone Network


By E. J. WELDON, Jr. 
(Manuscript received March 7, 1966) 


This brief contains a summary of data taken in the course of a recent 
error-control experiment. In this experiment data were transmitted 
over switched voiceband telephone lines at 2000 bits per second using 
Data-Phone* data set 201A. With the transmitting terminal located at 
the Holmdel, New Jersey laboratory, connections via the switched 
network were established to various cities (Baltimore, Cleveland, Dallas, 
Denver, Louisville, and St. Louis). There a return connection, again 
via the switched network, was established to the receiving terminal 
which was also located at Holmdel. Nearly all calls were made during 
the business day. 

Errors were corrected by means of a forward-acting cyclic code which 
was formed by interleaving the (15,9) code generated by 2° + 2° + 
x + 1 to degree $i$. As a result, (9/15)·2000 = 1200 information bits
per second were transmitted. In the first half of the experiment, $i$ was
set to 73; in the second half, 200. Since the (15,9) code corrects all
bursts of length three or less, the interleaved code can correct all bursts
of length $3i$ or less. Thus, the codes are optimal burst-correctors in the
sense that the equality holds in the Reiger bound, i.e.,

$$b = \frac{n - k}{2},$$

where $b$ is the guaranteed burst-correcting ability of the code, $n$ is the
code length, and $k$ is the number of information symbols in the code.
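The effect of interleaving on burst correction can be illustrated as follows. This is a sketch under an assumption: it uses the usual symbol-by-symbol arrangement in which transmitted position $t$ belongs to subword $t \bmod i$ at index $\lfloor t/i \rfloor$; the text does not spell out the arrangement used in the experiment, and the function names are hypothetical.

```python
def interleaved_parameters(i, n0=15, k0=9, b0=3):
    """Length, information symbols and burst-correcting ability after interleaving to degree i."""
    return n0 * i, k0 * i, b0 * i

def worst_subword_span(i, start, length):
    """Longest stretch of any single subword touched by a channel burst.

    Assumed layout: transmitted position t carries symbol t // i of subword t % i.
    """
    hit = {}
    for t in range(start, start + length):
        hit.setdefault(t % i, []).append(t // i)
    return max(max(idx) - min(idx) + 1 for idx in hit.values())

for degree in (73, 200):
    n, k, b = interleaved_parameters(degree)
    # Equality in the Reiger bound: b = (n - k) / 2.
    print(degree, n, k, b, 2 * b == n - k)
    # A burst of the maximum length 3i touches each subword in at most 3 adjacent symbols.
    print("worst span for a burst of length", b, ":", worst_subword_span(degree, 10, b))
```

Under this layout a channel burst of length $3i$ is seen by each (15,9) subword as a burst of length three or less, which that code corrects.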

In each case, the $i$ subwords were decoded independently using the
Peterson algorithm. Decoding in this manner, rather than using the
Peterson algorithm to decode the cyclic code of length $15i$ directly,
enables the code to correct many error patterns which would otherwise
be uncorrectable. This improves performance considerably since it
enables the code to correct most error patterns of low weight
(2, 3, 4, ..., 10, say) even though its minimum distance is only 3. It
also simplifies the decoding circuitry somewhat. In this case the decoder
employed approximately 210 transistors and a $14(i - 1)$-bit,
limited-access, line-speed storage device.

The data are summarized in Table I.

* Data-Phone is a service mark of the Bell System.

TABLE I — SUMMARY OF DATA

Interleaving degree, i                              73            200
Code length, n = 15i                                1095          3000
Number of information symbols, k = 9i               657           1800
Burst-correcting ability, b = (n − k)/2             219           600
Number of calls                                     126           279
Number of hours                                     259           284
Number of bits transmitted                          1.9 × 10^9    2.0 × 10^9
Line bit error rate                                 5.7 × 10^-6   1.2 × 10^-5
Number of line errors*                              3613          5171
Number of delivered errors*                         52            24
Improvement factor                                  70            215
Mean time between delivered errors* (hours)         5.0           11.9
Delivered error rate (errors* per bit)              2.7 × 10^-8   1.2 × 10^-8
Number of line word errors                          3972          8704
Number of delivered word errors                     83            59
Word improvement factor                             48            150
Delivered word error rate (word errors per word)    4.8 × 10^-5   8.7 × 10^-5
Number of line bit errors                           10607         24472
Number of delivered bit errors                      2109          1703
Bit improvement factor                              5             14
Delivered bit error rate (bit errors per bit)       1.1 × 10^-6   8.5 × 10^-7

Because there does not seem to exist a single adequate measure of
performance for such systems, the results are presented in terms of three
different types of error rate. These are based on bit errors, n-bit word
errors, and errors.* The utility of the first two performance measures is
apparent; however, in many situations the third is the most appropriate.
For example, the mean time between errors* is the average duration of
error-free intervals of useful length. This is a meaningful
figure-of-merit for data users who require perfect transmission almost all
of the time and for whom the cost of an error is relatively insensitive to
the duration or the bit-error-density of the error. It is of interest to
note that, regardless of how measured, the improvement in performance
attributable to the error control system appears to increase
approximately linearly with the degree of interleaving. Also the average
line bit error rate encountered in this test was quite close to values
reported in similar experiments on telephone facilities.
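The derived rows of Table I follow from the raw counts by simple ratios, and the sketch below recomputes a few of them. The dictionary layout and key names are illustrative assumptions; the numbers are the counts given in the table, and the printed table values appear to be rounded, so small differences are to be expected.

```python
# Raw counts from Table I, keyed by interleaving degree i.
table = {
    73: {"hours": 259, "bits": 1.9e9,
         "line_errors": 3613, "delivered_errors": 52,
         "line_bit_errors": 10607, "delivered_bit_errors": 2109},
    200: {"hours": 284, "bits": 2.0e9,
          "line_errors": 5171, "delivered_errors": 24,
          "line_bit_errors": 24472, "delivered_bit_errors": 1703},
}

def derived(row):
    """Recompute the ratio-type rows of the table from the raw counts."""
    return {
        "improvement_factor": row["line_errors"] / row["delivered_errors"],
        "bit_improvement_factor": row["line_bit_errors"] / row["delivered_bit_errors"],
        "mean_hours_between_errors": row["hours"] / row["delivered_errors"],
        "delivered_error_rate": row["delivered_errors"] / row["bits"],
        "delivered_bit_error_rate": row["delivered_bit_errors"] / row["bits"],
    }

results = {i: derived(row) for i, row in table.items()}
for i, values in results.items():
    print("i =", i)
    for name, value in values.items():
        print("   ", name, "=", format(value, ".3g"))

# Ratio of improvement factors compared with the ratio of interleaving degrees.
print(results[200]["improvement_factor"] / results[73]["improvement_factor"], 200 / 73)
```

The last line compares how the improvement factor and the interleaving degree scale between the two halves of the experiment, which is the basis of the remark about approximately linear growth.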

The author wishes to thank the following individuals for their
cooperation throughout the course of the experiment: G. S. Robinson,
who built the original error-control system (with $i$ = 73); P.
Mecklenburg, who changed the interleaving degree to 200 and suggested
several improvements in the system; and A. R. Lingenfelter, who was
responsible for recording and reducing the data.

* An error is defined as a sequence of words all of which contain at least one bit
error.
A Note on a Type of Optimization Problem
that Arises in Communication Theory


By I. W. SANDBERG
(Manuscript received March 16, 1966)


A problem that has arisen in connection with the use of transversal
filters to reduce the effect of intersymbol interference in digital
communication systems is to determine a real N-vector
$c \triangleq (c_1, c_2, \cdots, c_N)$ such that, with
$n_0 \in \mathcal{F} \triangleq \{1, 2, \cdots, N\}$,

$$\sum_{\substack{n=-\infty \\ n \neq n_0}}^{\infty} \Big| \sum_{j \in \mathcal{F}} c_j x_{n-j} \Big| \qquad (1)$$

is minimized subject to the constraint

$$1 = \sum_{j \in \mathcal{F}} c_j x_{n_0 - j}. \qquad (2)$$

Here $\{x_n\}_{-\infty}^{\infty}$ denotes a set of real constants such that
$|x_0| > \sum_{n \neq 0} |x_n|$. Lucky has proved the interesting theorem
that the optimal choice of $c$ coincides with the unique solution of the
equations

$$1 = \sum_{j \in \mathcal{F}} c_j x_{n_0 - j}, \qquad 0 = \sum_{j \in \mathcal{F}} c_j x_{n-j}, \quad n \in \mathcal{F} - \{n_0\}. \qquad (3)$$

The proof of Ref. 1 consists of establishing a contradiction to the
assertion that (1), with $c_{n_0}$ eliminated with the aid of (2), is
minimized for some $c$ for which (3) is not satisfied. The reader is
referred to Ref. 1 for the details.
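A small numerical check of the statement can be sketched as follows. Everything specific here is an assumption made for illustration: the three-sample pulse, the tap count N = 3, the choice $n_0 = 2$, the trial perturbations, and the helper names. The sketch solves the equations (3) by Gaussian elimination and compares the quantity (1) at that solution with its value at a few other tap vectors rescaled to satisfy the constraint (2).

```python
def solve_linear(matrix, rhs):
    """Solve a small square linear system by Gauss-Jordan elimination with partial pivoting."""
    size = len(matrix)
    aug = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(size):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[r][size] / aug[r][r] for r in range(size)]

# Hypothetical pulse samples x_n with |x_0| > sum of the other |x_n|.
SAMPLES = {-1: 0.1, 0: 1.0, 1: -0.2}
N, N0 = 3, 2
TAPS = range(1, N + 1)

def x(n):
    return SAMPLES.get(n, 0.0)

def output(c, n):
    """Equalized sample: sum over j of c_j x_{n-j}."""
    return sum(c[j - 1] * x(n - j) for j in TAPS)

def distortion(c):
    """The sum (1); only n = 0 .. N+1 can give nonzero terms for this pulse."""
    return sum(abs(output(c, n)) for n in range(0, N + 2) if n != N0)

def normalize(c):
    """Rescale c so that the constraint (2) holds."""
    main = output(c, N0)
    return [c_j / main for c_j in c]

# Equations (3): output 1 at n = n0 and 0 at the other n in {1, ..., N}.
system = [[x(n - j) for j in TAPS] for n in TAPS]
target = [1.0 if n == N0 else 0.0 for n in TAPS]
c_star = solve_linear(system, target)

trials = [(0.05, 0.0, -0.03), (-0.02, 0.1, 0.04), (0.0, 0.0, 0.2)]
candidates = [normalize([c + d for c, d in zip(c_star, delta)]) for delta in trials]
print("distortion at c*:", distortion(c_star))
print("distortion at trial vectors:", [distortion(c) for c in candidates])
```

In this example every trial vector that satisfies (2) gives a larger value of (1) than the solution of (3), as the theorem asserts; a handful of trials is of course only a spot check, not a proof.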

The purpose of this note is to show that Lucky’s result, and far more 
general results of similar type, can be directly deduced from the
following proposition.
Proposition: Let $f^*$ and $\mathcal{R}(f^*)$, respectively, denote an
abstract element and a set such that $f^* \notin \mathcal{R}(f^*)$. Let
$\mathcal{S} \triangleq \{f^*\} \cup \mathcal{R}(f^*)$, and let $Q$ denote
a mapping of $\mathcal{S}$ into the set of nonnegative numbers. Let $S_0$
denote a normed linear space with norm $\|\cdot\|$, and let $R$ denote a
mapping of $\mathcal{S}$ into $S_0$. Let

$$\delta(f) \triangleq Qf + \|Rf\|$$

for all $f \in \mathcal{S}$. Suppose that

(i) $Qf^* = 0$
(ii) for all $g \in \mathcal{R}(f^*)$,

$$Qg \geq \|Rg - Rf^*\|. \qquad (4)$$

Then for all $f \in \mathcal{S}$,

$$\delta(f) \geq \delta(f^*) \qquad (5)$$

and, if (4) holds with strict inequality for all $g \in \mathcal{R}(f^*)$,
then (5) holds with strict inequality for all $f \in \mathcal{S}$ except
$f = f^*$.


Proof: Let $f \in \mathcal{R}(f^*)$. Then

$$\delta(f) - \delta(f^*) = Qf + \|Rf\| - Qf^* - \|Rf^*\| = Qf + \|Rf\| - \|Rf^*\| \geq Qf - \|Rf - Rf^*\|,$$

from which the validity of the proposition is evident.


An Application of the Proposition 


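For each $j \in \mathcal{F} \triangleq \{1, 2, \cdots, N\}$, let
$\{x_{nj}\}_{n=-\infty}^{\infty}$ denote a set of real numbers such that
$|x_{jj}| > \sum_{n \neq j} |x_{nj}|$. Let $\mathcal{F}'$ denote a proper
subset of $\mathcal{F}$ containing at least one element, and let
$\{a_n \mid n \in \mathcal{F}'\}$ be a set of real numbers. Consider the
problem of determining a real N-vector
$c \triangleq (c_1, c_2, \cdots, c_N)$ such that

$$\delta(c) \triangleq \sum_{\substack{n=-\infty \\ n \notin \mathcal{F}'}}^{\infty} \Big| \sum_{j \in \mathcal{F}} c_j x_{nj} \Big|$$

is minimized subject to the constraints

$$a_n = \sum_{j \in \mathcal{F}} c_j x_{nj}, \quad n \in \mathcal{F}'. \qquad (6)$$

Our assumption that $|x_{jj}| > \sum_{n \neq j} |x_{nj}|$ for
$j \in \mathcal{F}$ implies that there exists a unique solution $c^*$ to
the set of equations

$$a_n = \sum_{j \in \mathcal{F}} c_j^* x_{nj}, \quad n \in \mathcal{F}'$$

$$0 = \sum_{j \in \mathcal{F}} c_j^* x_{nj}, \quad n \in (\mathcal{F} - \mathcal{F}').$$

We shall prove that if $c \neq c^*$ and $c$ satisfies the constraints of
(6), then $\delta(c) > \delta(c^*)$. For the special case in which
$\mathcal{F}'$ contains a single element, this result can be proved with a
modification of Lucky’s technique.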

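Let $\mathcal{R}(c^*)$ denote the set of all real N-vectors $g$, except
the vector $c^*$, such that

$$a_n = \sum_{j \in \mathcal{F}} g_j x_{nj}, \quad n \in \mathcal{F}'.$$

Let $Q$ be the mapping of
$\mathcal{S} \triangleq \{c^*\} \cup \mathcal{R}(c^*)$ into the set of
nonnegative numbers defined by

$$Qv = \sum_{n \in (\mathcal{F} - \mathcal{F}')} \Big| \sum_{j \in \mathcal{F}} v_j x_{nj} \Big|$$

for all $v \in \mathcal{S}$.
Let $S_0$ denote the linear space of vectors
$u = (\cdots, u_{-1}, u_0, u_{N+1}, u_{N+2}, \cdots)$ with norm

$$\|u\| = \sum_{n \notin \mathcal{F}} |u_n|$$

and let $R$ denote the mapping of $\mathcal{S}$ into $S_0$ defined by

$$(Rv)_n = \sum_{j \in \mathcal{F}} v_j x_{nj}, \quad n \notin \mathcal{F}$$

for all $v \in \mathcal{S}$. Then we have

$$\delta(c) = Qc + \|Rc\|$$

for all $c \in \mathcal{S}$. Since $Qc^* = 0$, if

$$Qg > \|Rg - Rc^*\|$$

for all $g \in \mathcal{R}(c^*)$, that is, if

$$\sum_{n \in (\mathcal{F} - \mathcal{F}')} \Big| \sum_{j \in \mathcal{F}} g_j x_{nj} \Big| - \sum_{n \notin \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} (g_j - c_j^*) x_{nj} \Big| > 0 \qquad (7)$$

for all $g \in \mathcal{R}(c^*)$, then, by the proposition,
$\delta(c) > \delta(c^*)$. To show that (7) is satisfied, observe that

$$\sum_{n \in (\mathcal{F} - \mathcal{F}')} \Big| \sum_{j \in \mathcal{F}} g_j x_{nj} \Big| = \sum_{n \in (\mathcal{F} - \mathcal{F}')} \Big| \sum_{j \in \mathcal{F}} (g_j - c_j^*) x_{nj} \Big| = \sum_{n \in \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} (g_j - c_j^*) x_{nj} \Big|$$

for $g \in \mathcal{R}(c^*)$, and that, with
$w_j \triangleq (g_j - c_j^*)$,

$$\sum_{n \in \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} w_j x_{nj} \Big| \geq \sum_{n \in \mathcal{F}} |w_n x_{nn}| - \sum_{n \in \mathcal{F}} \sum_{\substack{j \in \mathcal{F} \\ j \neq n}} |w_j|\,|x_{nj}|$$

and

$$\sum_{n \notin \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} w_j x_{nj} \Big| \leq \sum_{n \notin \mathcal{F}} \sum_{j \in \mathcal{F}} |w_j| \cdot |x_{nj}|.$$

Therefore,

$$\sum_{n \in \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} w_j x_{nj} \Big| - \sum_{n \notin \mathcal{F}} \Big| \sum_{j \in \mathcal{F}} w_j x_{nj} \Big| \geq \sum_{n \in \mathcal{F}} |w_n| \Big( |x_{nn}| - \sum_{k \neq n} |x_{kn}| \Big), \qquad (8)$$

which completes our proof, since the right side of (8) is positive for all
$g \in \mathcal{R}(c^*)$.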